Abstract. The Furstenberg recurrence theorem (or equivalently, Szemerédi's theorem) can be formulated in the language of von Neumann algebras as follows: given an integer $k \geq 2$, an abelian finite von Neumann algebra $(\mathcal{M},\tau)$ with an automorphism $\alpha : \mathcal{M} \to \mathcal{M}$, and a non-negative $a \in \mathcal{M}$ with $\tau(a) > 0$, one has $\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \operatorname{Re} \tau(a \alpha^n(a) \cdots \alpha^{(k-1)n}(a)) > 0$; a subsequent result of Host and Kra shows that this limit exists. In particular, $\operatorname{Re} \tau(a \alpha^n(a) \cdots \alpha^{(k-1)n}(a)) > 0$ for all $n$ in a set of positive density.

From the von Neumann algebra perspective, it is thus natural to ask to 
what extent these results remain true when the abelian hypothesis is dropped. 
All three claims hold for k = 2, and we show in this paper that all three claims 
hold for all k when the von Neumann algebra is asymptotically abelian, and 
that the last two claims hold for k = 3 when the von Neumann algebra is 
ergodic. However, we show that the first claim can fail for k = 3 even with 
ergodicity, the second claim can fail for $k \geq 4$ even when assuming ergodicity,
and the third claim can fail for $k = 3$ without ergodicity, or $k \geq 5$ and odd
assuming ergodicity. The second claim remains open for non-ergodic systems 
with k = 3, and the third claim remains open for ergodic systems with k = 4. 
1. Introduction

1.1. Multiple recurrence. Let $(X, \mathcal{X}, \mu)$ be a probability space, and let $T : X \to X$ be a measure-preserving invertible transformation on $X$ (i.e. $T$, $T^{-1}$ are both measurable, and $\mu(T(A)) = \mu(A)$ for all measurable $A$). From the mean ergodic theorem we know that for any $f \in L^\infty(X)$, the averages$^1$ $\frac{1}{N} \sum_{n=1}^N f \circ T^{-n}$ converge in (say) $L^2(X)$ norm, which implies in particular that the averages $\frac{1}{N} \sum_{n=1}^N \int_X f_1 (f_2 \circ T^{-n})\, d\mu$ converge for all $f_1, f_2 \in L^\infty(X)$. Furthermore, if $f_1 = f_2 = f$ is non-negative with positive mean $\int_X f\, d\mu > 0$, then the Poincaré recurrence theorem implies that this latter limit is strictly positive. In particular, this implies that the mean $\int_X f (f \circ T^{-n})\, d\mu$ is positive for all natural numbers $n$ in a set $E \subset \mathbb{N}$ of positive (lower) density (which means that $\liminf_{N \to \infty} \frac{1}{N} \#\{1 \leq n \leq N : n \in E\} > 0$).
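To make the single recurrence statements concrete, here is a minimal sketch in the simplest possible setting: $X = \mathbb{Z}/d\mathbb{Z}$ with the uniform measure and the shift $T(x) = x + 1$. This finite model, the sample function `f`, and the helper names `compose_shift`, `ergodic_average` and `correlation` are assumptions made purely for illustration; they do not appear in the paper. Averaging over one full period, the ergodic average is exactly the constant function equal to the mean of $f$, and the averaged correlation is exactly the square of the mean, hence positive.

```python
from fractions import Fraction

def compose_shift(f, n):
    """Return f o T^{-n} for the shift T(x) = x + 1 on Z/dZ (f is a list of values)."""
    d = len(f)
    return [f[(x - n) % d] for x in range(d)]

def ergodic_average(f, N):
    """(1/N) * sum_{n=1}^N f o T^{-n}, computed pointwise in exact arithmetic."""
    d = len(f)
    total = [Fraction(0)] * d
    for n in range(1, N + 1):
        g = compose_shift(f, n)
        total = [t + v for t, v in zip(total, g)]
    return [t / N for t in total]

def correlation(f, n):
    """Integral of f * (f o T^{-n}) against the uniform probability measure."""
    d = len(f)
    g = compose_shift(f, n)
    return sum(a * b for a, b in zip(f, g)) / d

# Hypothetical non-negative function on Z/6Z with positive mean.
f = [Fraction(v) for v in (1, 0, 0, 2, 0, 1)]
d = len(f)
mean = sum(f) / d

# Average over one full period of the shift.
avg = ergodic_average(f, d)
avg_corr = sum(correlation(f, n) for n in range(1, d + 1)) / d
```

In this toy model the shift is periodic, so nothing asymptotic is being tested; the point is only to see the objects in the mean ergodic theorem and in Poincaré recurrence written out explicitly.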

Thanks to a long effort starting with Furstenberg's groundbreaking new proof [15] of Szemerédi's theorem on arithmetic progressions [35], it is now known that all of these single recurrence results extend to multiple recurrence:

Theorem 1.1 (Abelian multiple recurrence). Let $(X,\mathcal{X},\mu)$ be a probability space, let $k \geq 2$ be an integer, and let $T : X \to X$ be a measure-preserving invertible transformation.

• (Convergence in norm) For any $f_1, \ldots, f_{k-1} \in L^\infty(X)$, the averages
\[ \frac{1}{N} \sum_{n=1}^N (f_1 \circ T^{-n}) \cdots (f_{k-1} \circ T^{-(k-1)n}) \]
converge in $L^2(X)$ norm as $N \to \infty$.

• (Weak convergence) For any $f_0, f_1, \ldots, f_{k-1} \in L^\infty(X)$, the averages
\[ \frac{1}{N} \sum_{n=1}^N \int_X f_0 (f_1 \circ T^{-n}) \cdots (f_{k-1} \circ T^{-(k-1)n})\, d\mu \]
converge as $N \to \infty$.



• (Recurrence on average) For any non-negative $f \in L^\infty(X)$ with $\int_X f\, d\mu > 0$, one has
\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \int_X f (f \circ T^{-n}) \cdots (f \circ T^{-(k-1)n})\, d\mu > 0. \tag{1} \]

• (Recurrence on a dense set) For any non-negative $f \in L^\infty(X)$ with $\int_X f\, d\mu > 0$, one has
\[ \int_X f (f \circ T^{-n}) \cdots (f \circ T^{-(k-1)n})\, d\mu > c > 0 \tag{2} \]
for some $c > 0$ and all $n$ in a set of natural numbers of positive lower density.

$^1$ The minus sign here is not of particular significance (other than to conform to some minor notational conventions) and can be ignored in the sequel if desired.

We have called this result the "abelian" multiple recurrence theorem to emphasise
the abelian nature of the algebra L°°(X). 

Remarks 1.1. Clearly, convergence in norm implies weak convergence; also, as the averages (2) are bounded and non-negative, recurrence on average implies recurrence on a dense set. Using the weak convergence result, the limit inferior in (1) can be replaced with a limit, but we have retained the limit inferior in order to keep the two claims logically independent of each other.

As mentioned earlier, the $k = 2$ cases of Theorem 1.1 follow from classical ergodic theorems. Furstenberg [15] established recurrence on average (and hence recurrence on a dense set) for all $k$, and observed that this result was equivalent (by what is now known as the Furstenberg correspondence principle) to Szemerédi's famous theorem [35] on arithmetic progressions, thus providing an important new proof of that theorem. Convergence in norm (and hence in mean) was established for $k = 3$ by Furstenberg [15], for $k = 4$ by Conze and Lesigne [5], [S], [10] (assuming total ergodicity) and by Host and Kra [55] (in general), for $k = 5$ in some cases by Ziegler [40], and for all $k$ by Host and Kra [23] (and subsequently also by Ziegler [41]). See [5S] for a survey of these results, and their relation to other topics such as dynamics of nilsequences, and arithmetic progressions in number-theoretic sets such as the primes. $\triangleleft$



There is also a multidimensional generalisation of the above results to multiple 
commuting shifts: 

Theorem 1.2 (Abelian multidimensional multiple recurrence). Let $(X,\mathcal{X},\mu)$ be a probability space, let $k \geq 2$ be an integer, and let $T_0, \ldots, T_{k-1} : X \to X$ be a commuting system of measure-preserving invertible transformations.

• (Convergence in norm) For any $f_1, \ldots, f_{k-1} \in L^\infty(X)$, the averages
\[ \frac{1}{N} \sum_{n=1}^N \big( (f_1 \circ T_1^{-n}) \cdots (f_{k-1} \circ T_{k-1}^{-n}) \big) \circ T_0^{n} \]
converge in $L^2(X)$ norm.

• (Weak convergence) For any $f_0, f_1, \ldots, f_{k-1} \in L^\infty(X)$, the averages
\[ \frac{1}{N} \sum_{n=1}^N \int_X (f_0 \circ T_0^{-n}) \cdots (f_{k-1} \circ T_{k-1}^{-n})\, d\mu \]
converge.

• (Recurrence on average) For any non-negative $f \in L^\infty(X)$ with $\int_X f\, d\mu > 0$, one has
\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \int_X (f \circ T_0^{-n})(f \circ T_1^{-n}) \cdots (f \circ T_{k-1}^{-n})\, d\mu > 0. \tag{3} \]

• (Recurrence on a dense set) For any non-negative $f \in L^\infty(X)$ with $\int_X f\, d\mu > 0$, one has
\[ \int_X (f \circ T_0^{-n}) \cdots (f \circ T_{k-1}^{-n})\, d\mu > c > 0 \tag{4} \]
for some $c > 0$ and all $n$ in a set of natural numbers of positive lower density.


Of course, Theorem 1.1 is the special case of Theorem 1.2 when $T_i := T^i$. It is often customary to normalise $T_0$ to be the identity transformation (by replacing each of the $T_i$ with $T_0^{-1} T_i$).

Remarks 1.2. The $k = 2$ case is again classical. Recurrence on average (and hence on a dense set) in this theorem was established for all $k$ by Furstenberg and Katznelson [16], which by the Furstenberg correspondence principle implies a multidimensional version of Szemerédi's theorem, a combinatorial proof of which in full generality has only been obtained relatively recently in [30] and [20]. Convergence in norm (and weak convergence) was established for $k = 3$ in [5], for some special cases of $k = 4$ in [35], for all $k$ assuming total ergodicity in [13], and for all $k$ unconditionally in [36] (with subsequent proofs at [37], [1], [2]). The results can fail if the shifts $T_0, \ldots, T_{k-1}$ do not commute [5]. Note that non-commutativity of the shifts should not be confused with the non-commutativity of the underlying algebra, which is the focus of this current paper. $\triangleleft$



1.2. Non-commutative analogues. From the perspective of the theory of von Neumann algebras, the space $L^\infty(X)$ appearing in the above theorems can be interpreted as an abelian von Neumann algebra, with a finite trace $\tau(f) := \int_X f\, d\mu$ and with an automorphism $T : L^\infty(X) \to L^\infty(X)$ defined by $Tf := f \circ T^{-1}$. It is then natural to ask whether the above results can be extended to non-abelian settings. More precisely, we recall the following definitions.

Definition 1.3 (Non-commutative systems). A finite von Neumann algebra is a pair $(\mathcal{M},\tau)$, where $\mathcal{M}$ is a von Neumann algebra (i.e. an algebra of bounded operators on a separable$^2$ complex Hilbert space that contains the identity $1$, is closed under adjoints, and is closed in the weak operator topology), and $\tau : \mathcal{M} \to \mathbb{C}$ is a finite faithful trace (i.e. a linear map with $\tau(a^*) = \overline{\tau(a)}$, $\tau(ab) = \tau(ba)$, and $\tau(a^*a) \geq 0$ for all $a, b \in \mathcal{M}$, with $\tau(a^*a) = 0$ if and only if $a = 0$, and $\tau(1) = 1$). The operator norm of an element $a \in \mathcal{M}$ is denoted $\|a\|$. We say that an element $a \in \mathcal{M}$ is non-negative if one has $a = b^*b$ for some $b \in \mathcal{M}$. An element $a \in \mathcal{M}$ is central if one has $ab = ba$ for all $b \in \mathcal{M}$. The set of all central elements is denoted $Z(\mathcal{M})$ and referred to as the centre of $\mathcal{M}$; the algebra $\mathcal{M}$ is abelian if $Z(\mathcal{M}) = \mathcal{M}$.

A shift $\alpha$ on a finite von Neumann algebra $(\mathcal{M},\tau)$ is a trace-preserving $*$-automorphism, i.e. $\alpha$ is an algebra isomorphism such that $\alpha(a^*) = \alpha(a)^*$ and $\tau(\alpha(a)) = \tau(a)$ for all $a \in \mathcal{M}$. We say that the shift is ergodic if the invariant algebra $\{a \in \mathcal{M} : \alpha(a) = a\}$ consists only of the constants $\mathbb{C}1$. We refer to the triple $(\mathcal{M},\tau,\alpha)$ as a von Neumann $\mathbb{Z}$-system, or a von Neumann dynamical system. More generally, if $\alpha_0, \ldots, \alpha_{k-1}$ are $k$ commuting shifts on $\mathcal{M}$, we refer to $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ as a von Neumann $\mathbb{Z}^k$-system.

$^2$ In our applications, the hypothesis of separability can be omitted, since one can always pass to the separable subalgebra generated by a finite collection $a_0, \ldots, a_{k-1}$ of elements and their shifts if desired.

It is easy to verify that if $(X,\mathcal{X},\mu)$ is a (classical) probability space with a shift $T : X \to X$, then $(L^\infty(X), \int_X \cdot\, d\mu, \circ T^{-1})$ is an (abelian example of a) von Neumann dynamical system, and more generally if $T_0, \ldots, T_{k-1} : X \to X$ are commuting shifts, then $(L^\infty(X), \int_X \cdot\, d\mu, \circ T_0^{-1}, \ldots, \circ T_{k-1}^{-1})$ is an abelian example of a von Neumann $\mathbb{Z}^k$-system. In fact, all abelian von Neumann dynamical systems arise (up to isomorphism of the algebras) as such examples; see Kadison and Ringrose [25, Chapter 5].

A finite von Neumann algebra $(\mathcal{M},\tau)$ gives rise to an inner product $\langle a, b \rangle := \tau(a^*b)$ on $\mathcal{M}$; the properties of the trace ensure that this inner product is positive definite. (We use the convention for a scalar product to be conjugate linear in the first coordinate.) The Hilbert space completion of $\mathcal{M}$ with respect to this inner product will be referred to as $L^2(\tau)$. Note that $\alpha$ extends to a unitary transformation on $L^2(\tau)$. In the abelian case when $\mathcal{M} = L^\infty(X,\mathcal{X},\mu)$, then $L^2(\tau)$ can be canonically identified with $L^2(X,\mathcal{X},\mu)$.

Inspired by Theorems 1.1 and 1.2, we now make the following definitions:

Definition 1.4 (Non-commutative recurrence and convergence). Let $k \geq 2$ be an integer, $(\mathcal{M},\tau,\alpha)$ be a von Neumann dynamical system, and $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ be a von Neumann $\mathbb{Z}^k$-system.

• We say that $(\mathcal{M},\tau,\alpha)$ enjoys order $k$ convergence in norm if for any $a_1, \ldots, a_{k-1} \in \mathcal{M}$, the averages
\[ \frac{1}{N} \sum_{n=1}^N (\alpha^n(a_1))(\alpha^{2n}(a_2)) \cdots (\alpha^{(k-1)n}(a_{k-1})) \]
converge in $L^2(\tau)$ as $N \to \infty$.

• We say that $(\mathcal{M},\tau,\alpha)$ enjoys order $k$ weak convergence if for any $a_0, a_1, \ldots, a_{k-1} \in \mathcal{M}$, the averages
\[ \frac{1}{N} \sum_{n=1}^N \tau\big(a_0 (\alpha^n(a_1))(\alpha^{2n}(a_2)) \cdots (\alpha^{(k-1)n}(a_{k-1}))\big) \]
converge as $N \to \infty$.

• We say that $(\mathcal{M},\tau,\alpha)$ enjoys order $k$ recurrence on average if for any non-negative $a \in \mathcal{M}$ with $\tau(a) > 0$ one has
\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \operatorname{Re} \tau\big(a (\alpha^n(a))(\alpha^{2n}(a)) \cdots (\alpha^{(k-1)n}(a))\big) > 0. \tag{5} \]

• We say that $(\mathcal{M},\tau,\alpha)$ enjoys order $k$ recurrence on a dense set if for any non-negative $a \in \mathcal{M}$ with $\tau(a) > 0$ one has
\[ \operatorname{Re} \tau\big(a (\alpha^n(a))(\alpha^{2n}(a)) \cdots (\alpha^{(k-1)n}(a))\big) > c > 0 \tag{6} \]
for some $c > 0$ and all $n$ in a set of natural numbers of positive lower density.

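• We say that $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ enjoys convergence in norm if for any $a_1, \ldots, a_{k-1} \in \mathcal{M}$, the averages
\[ \frac{1}{N} \sum_{n=1}^N \alpha_0^{-n}\big((\alpha_1^n(a_1))(\alpha_2^n(a_2)) \cdots (\alpha_{k-1}^n(a_{k-1}))\big) \]
converge in $L^2(\tau)$ as $N \to \infty$.

• We say that $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ enjoys weak convergence if for any $a_0, a_1, \ldots, a_{k-1} \in \mathcal{M}$, the averages
\[ \frac{1}{N} \sum_{n=1}^N \tau\big((\alpha_0^n(a_0))(\alpha_1^n(a_1))(\alpha_2^n(a_2)) \cdots (\alpha_{k-1}^n(a_{k-1}))\big) \]
converge as $N \to \infty$.

• We say that $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ enjoys recurrence on average if for any non-negative $a \in \mathcal{M}$ with $\tau(a) > 0$ one has
\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \operatorname{Re} \tau\big((\alpha_0^n(a))(\alpha_1^n(a)) \cdots (\alpha_{k-1}^n(a))\big) > 0. \tag{7} \]

• We say that $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ enjoys recurrence on a dense set if for any non-negative $a \in \mathcal{M}$ with $\tau(a) > 0$ one has
\[ \operatorname{Re} \tau\big((\alpha_0^n(a))(\alpha_1^n(a)) \cdots (\alpha_{k-1}^n(a))\big) > c > 0 \tag{8} \]
for some $c > 0$ and all $n$ in a set of natural numbers of positive lower density.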

Remark 1.1. As before, we may normalise $\alpha_0$ to be the identity. Of course, the first four properties here are nothing more than the specialisations of the last four to the case $\alpha_i = \alpha^i$ for $0 \leq i \leq k-1$. The real part is needed in (5)-(8) because there is no necessity for the traces here to be real-valued (the difficulty being that the product of two non-negative elements of a non-abelian von Neumann algebra need not remain non-negative). In the case of (5) one can omit the real part by taking averages from $-N$ to $N$, since one has the symmetry
\begin{align*}
\overline{\tau\big(a (\alpha^n(a))(\alpha^{2n}(a)) \cdots (\alpha^{(k-1)n}(a))\big)} &= \tau\big((a (\alpha^n(a))(\alpha^{2n}(a)) \cdots (\alpha^{(k-1)n}(a)))^*\big) \\
&= \tau\big((\alpha^{(k-1)n}(a)) \cdots (\alpha^{2n}(a))(\alpha^n(a))\, a\big) \\
&= \tau\big(a (\alpha^{-n}(a)) \cdots (\alpha^{-(k-1)n}(a))\big)
\end{align*}
for any self-adjoint $a$.
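The following sketch illustrates, in the smallest non-abelian example, why the traces above can fail to be positive. It takes $\mathcal{M}$ to be the $2 \times 2$ real matrices inside $M_2(\mathbb{C})$ with the normalised trace, the shift $\alpha(x) = U x U^*$ for a rotation $U$ by the angle $\pi/3$, and $a$ a rank-one projection. These choices, and the helper names `matmul`, `tau`, `rotation` and `shift`, are our own assumptions for illustration; this inner, non-ergodic shift is far simpler than the counterexamples constructed later in the paper.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def tau(A):
    """Normalised trace on 2x2 matrices, so that tau(identity) = 1."""
    return (A[0][0] + A[1][1]) / 2

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def shift(A, theta, n):
    """alpha^n(A) = U^n A U^{-n}, with U the rotation by theta (real, so U* = U^T)."""
    U = rotation(n * theta)
    return matmul(matmul(U, A), transpose(U))

theta = math.pi / 3
a = [[1.0, 0.0], [0.0, 0.0]]            # rank-one projection, hence non-negative

a1 = shift(a, theta, 1)                  # alpha(a)
a2 = shift(a, theta, 2)                  # alpha^2(a)
product_two = matmul(a, a1)              # product of two non-negative elements
triple = tau(matmul(product_two, a2))    # tau(a alpha(a) alpha^2(a)) for n = 1
```

Here `product_two` is not even self-adjoint, and `triple` equals $\frac{1}{2}\cos^2(\pi/3)\cos(2\pi/3) = -\frac{1}{16}$, so the $k = 3$ expression in (6) is negative at $n = 1$ for this particular system even though $a$ is non-negative with $\tau(a) = \frac{1}{2}$.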

Note however that it is quite possible for the traces appearing in (5)-(8) to be negative even when $a$ is non-negative. Because of this, while recurrence on average still implies recurrence on a dense set, the converse is not true; one can have recurrence on a dense set but end up with a zero or even negative average due to the presence of large negative values of these traces. We will see examples of this later in this paper.

Remark 1.2. As mentioned earlier, the Furstenberg correspondence principle equates recurrence results with combinatorial statements (such as Szemerédi's theorem) which can be formulated in a purely finitary fashion. However, we do not know whether the same is true for non-commutative recurrence results. Formulating a finitary statement that would imply recurrence results for some non-abelian von Neumann dynamical system probably requires some quite strong approximate embeddability of the system into finite-dimensional matrix algebras with approximate shifts, together with a recurrence assertion for such finite-dimensional systems in which the various parameters may all be chosen independent of the dimension. Since many of the results we prove below in the infinitary setting are negative anyway, we will not pursue this issue here. $\triangleleft$

The study of these properties (and related topics) for von Neumann dynamical systems has been pursued by Niculescu, Ströh and Zsidó [31], Duvenhage [11], Beyers, Duvenhage and Ströh [5], and Fidaleo [13]. A variant of these questions, in which one averages over a higher-dimensional range of shifts, was also studied in [12]. In this paper we shall develop further positive and negative results regarding these properties, which we now present.

1.3. Positive results. We first remark that when $k = 2$, all systems enjoy norm and weak convergence, as well as recurrence on average and on a dense set, thanks to the ergodic theorem for von Neumann algebras (see e.g. [29, Section 9.1]). Indeed, from that theorem, we know that for any von Neumann dynamical system $(\mathcal{M},\tau,\alpha)$ and $a \in \mathcal{M}$, the averages $\frac{1}{N} \sum_{n=1}^N \alpha^n(a)$ converge in $L^2(\tau)$ to the orthogonal projection of $a$ to the invariant space $L^2(\tau)^\alpha := \{f \in L^2(\tau) : \alpha(f) = f\}$, giving the convergence results. If $a$ is non-negative and non-zero, this projection can be verified to have a positive inner product with $a$, giving the recurrence results.

Now we consider the cases $k \geq 3$. We have already seen from Theorems 1.1, 1.2 that we have convergence and recurrence in those abelian systems arising from ergodic theory, and have recalled above that in fact these include all examples (up to isomorphism).

Proposition 1.5. Let $k \geq 2$. If $(\mathcal{M},\tau,\alpha)$ is an abelian von Neumann dynamical system, then $(\mathcal{M},\tau,\alpha)$ enjoys weak convergence and convergence in norm, and recurrence on average and on a dense set.

More generally, if $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ is an abelian von Neumann $\mathbb{Z}^k$-system, then this $\mathbb{Z}^k$-system enjoys weak convergence and convergence in norm, and recurrence on average and on a dense set.

We now generalise these results to the wider class of asymptotically abelian systems.

Definition 1.6 (Asymptotic abelianness). A von Neumann dynamical system $(\mathcal{M},\tau,\alpha)$ is asymptotically abelian if one has
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \| [\alpha^n(a), b] \|_{L^2(\tau)} = 0 \]
for all $a, b \in \mathcal{M}$, where $[a, b] := ab - ba$ is the commutator.

Remark 1.3. In previous literature such as [6], a stronger version of asymptotic abelianness is assumed, in which the $L^2(\tau)$ norm is replaced by the operator norm. Variants of this type of "topological asymptotic abelianness", and their relationship with non-commutative topological weak mixing, have also been considered in [27]. $\triangleleft$



Our work also singles out this case as special, since the assumption of asymptotic abelianness seems to be essential for the correct working of some of the chief technical tools taken from the commutative setting (particularly the van der Corput estimate). In the previous works [31], [6], convergence and recurrence were shown for all orders $k$ for asymptotically abelian systems under some additional assumptions such as weak mixing or compactness. Our first main result shows that in fact all asymptotically abelian systems enjoy convergence and recurrence.

Theorem 1.7. Let $k \geq 2$. If $(\mathcal{M},\tau,\alpha)$ is an asymptotically abelian von Neumann dynamical system, then $(\mathcal{M},\tau,\alpha)$ enjoys weak convergence and convergence in norm, and recurrence on average and on a dense set.

More generally, if $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ is a von Neumann $\mathbb{Z}^k$-system, and the $\alpha_i \alpha_j^{-1}$ for $i \neq j$ are each individually asymptotically abelian, then this $\mathbb{Z}^k$-system enjoys weak convergence and convergence in norm, and recurrence on average and on a dense set.



Theorem 1.7 is deduced from the genuinely abelian case (Proposition 1.5) using two results. The first is essentially from [6] or [11], which considered the model case $\alpha_i = \alpha^i$; for the sake of completeness, we present a proof in Appendix A.

Theorem 1.8 (Multiple ergodic averages for relatively weakly mixing extensions). Let $(\mathcal{M},\tau,\alpha_0,\ldots,\alpha_{k-1})$ be a von Neumann $\mathbb{Z}^k$-system, and let $\mathcal{N}$ be a von Neumann subalgebra of $\mathcal{M}$ which is invariant under all of the $\alpha_i$. If for any distinct $0 \leq i, j \leq k-1$ the shift $\alpha_i \alpha_j^{-1}$ is asymptotically abelian and weakly mixing relative to $\mathcal{N}$, then the associated multiple ergodic averages satisfy
\[ \Big\| \frac{1}{N} \sum_{n=1}^N \alpha_0^{-n}\Big( \prod_{i=1}^{k-1} \alpha_i^n(a_i) \Big) - \frac{1}{N} \sum_{n=1}^N \alpha_0^{-n}\Big( \prod_{i=1}^{k-1} \alpha_i^n(E_{\mathcal{N}}(a_i)) \Big) \Big\|_{L^2(\tau)} \to 0 \]
as $N \to \infty$, where $E_{\mathcal{N}} : \mathcal{M} \to \mathcal{N}$ is the conditional expectation constructed from $\tau$, and the products are from left to right.

We will recall the notions of relative weak mixing and conditional expectation in Section 3.

The second result, which is new and may have other applications elsewhere, can be 
viewed as a partial analogue of the Furstenberg-Zimmer structure theorem [17] for 
asymptotically abelian systems. 

Theorem 1.9 (Structure theorem for asymptotically abelian systems). If $(\mathcal{M},\tau,\alpha)$ is an asymptotically abelian von Neumann dynamical system, then $\alpha$ is weakly mixing relative to the centre $Z(\mathcal{M}) \subset \mathcal{M}$.

Remark 1.4. In the case when $\mathcal{M}$ is a factor (i.e. the centre is trivial), results of this nature (with a slightly different notion of mixing, and of asymptotic abelianness) were established in Example 4.3.24].



These results quickly imply Theorem 1.7. Indeed, when studying (for instance) convergence in norm for a $\mathbb{Z}^k$-system, one can use Theorem 1.9 followed by Theorem 1.8 to replace each of the $a_1, \ldots, a_{k-1}$ by their conditional expectations $E_{Z(\mathcal{M})}(a_1), \ldots, E_{Z(\mathcal{M})}(a_{k-1})$ without any effect on the convergence, at which point one can apply Proposition 1.5. (Note that the centre $Z(\mathcal{M})$ does not depend on what shift $\alpha_i \alpha_j^{-1}$ one is analysing.) The other claims are similar (using Lemma 3.1 to ensure that if $a$ is non-negative with positive trace, then so is the conditional expectation $E_{Z(\mathcal{M})}(a)$).

Remark 1.5. The above arguments in fact show a more quantitative statement: if $a$ is non-negative with $\|a\| \leq 1$ and $\tau(a) \geq \delta$ for some $0 < \delta \leq 1$, then one has the same lower bound $c(k,\delta) > 0$ for (6) as is given by Szemerédi's theorem for (1) for non-negative functions $f$ with $\|f\|_{L^\infty(X)} \leq 1$ and $\int_X f\, d\mu \geq \delta$ (in particular, one could insert the bound of Gowers [19]). Similar remarks apply to multiple commuting shifts. We leave the details to the reader. $\triangleleft$



The proof of Theorem 1.9, given in Section 3 below, rests on non-commutative versions of several of the steps on the way to the Furstenberg-Zimmer Structure Theorem in the commutative world of ergodic theory [15, 43, 42]. In particular, it rests on a version of the dichotomy between relatively weakly mixing inclusions and those containing a relatively isometric subinclusion, well-known in ergodic theory from the work of Furstenberg [15] and Zimmer [43, 42] and already generalized to the non-commutative world by Popa in [32], for applications to the study of superrigidity phenomena.

If $(\mathcal{M},\tau,\alpha)$ is not asymptotically abelian then matters are rather more complicated, with positive results only obtaining under additional restrictions. For $k = 3$ and for ergodic shifts, we have a positive result, established in Section 5:

Theorem 1.10. If $k = 3$ and $(\mathcal{M},\tau,\alpha)$ is an ergodic von Neumann dynamical system, then one has weak convergence and convergence in norm, as well as recurrence on a dense set.



We remark that the weak convergence result was previously established in [13].



1.4. Negative results. Recurrence on average has been omitted from Theorem 1.10. This is because this result fails:

Theorem 1.11. Let $k = 3$, then there exists an ergodic von Neumann dynamical system $(\mathcal{M},\tau,\alpha)$ for which recurrence on average fails. (In fact one can make the average (5) strictly negative.)

We establish this in Section 2.2. The main tool is a sophisticated version of the Behrend set construction, combined with the crossed product construction.
When one drops the ergodicity assumption$^3$, one also loses recurrence on a dense set:

Theorem 1.12. Let $k = 3$, then there exists a von Neumann dynamical system $(\mathcal{M},\tau,\alpha)$ for which recurrence on a dense set fails. (In fact one can make the means (6) equal to a negative constant for all non-zero $n$.)

We establish this in Section 2.2 also. This result is simpler to prove than Theorem 1.11, and uses the original Behrend set construction, and crossed product constructions.

One also loses recurrence on a dense set for larger k even when ergodicity is assumed: 

Theorem 1.13. Let $k \geq 5$ be odd, then there exists an ergodic von Neumann dynamical system $(\mathcal{M},\tau,\alpha)$ for which recurrence on a dense set fails. (In fact one can make the means (6) equal to a negative constant for all non-zero $n$.)

We establish this in Section 2.3. This result uses a counterexample of Bergelson,
Host, Kra, and Ruzsa [3], combined with a group theoretic construction. The re- 
striction to odd k is mostly technical and can almost certainly be removed; however, 



we are unable to decide whether Theorem 1.13 can be extended to the k = 4 case, 
because it was shown in [4] that the k = 5 counterexample in that paper cannot be 
replicated for k = 4. 

For convergence, we have counterexamples for $k \geq 4$ even when assuming ergodicity:

Theorem 1.14. Let $k \geq 4$, then there exists an ergodic von Neumann dynamical system $(\mathcal{M},\tau,\alpha)$ for which weak convergence and convergence in norm fail.

We establish this in Section 2.1. The main tool is a group theoretic construction.

The above counterexamples were for the single shift case, but of course they are also counterexamples to the more general situation of multiple commuting shifts. We summarise the positive and negative results (in the single shift case) in Table 1.

We note in particular that the following questions remain open: 

Problem 1.15. If $k = 3$, does weak or norm convergence hold for non-ergodic von Neumann dynamical systems $(\mathcal{M},\tau,\alpha)$?

Problem 1.16. If $k = 3$, does weak or norm convergence hold for von Neumann $\mathbb{Z}^3$-systems $(\mathcal{M},\tau,\alpha_0,\alpha_1,\alpha_2)$ (possibly after imposing suitable ergodicity hypotheses)?

Problem 1.17. If $k = 4$ (or if $k \geq 6$ is even), does recurrence on a dense set hold for ergodic von Neumann dynamical systems $(\mathcal{M},\tau,\alpha)$?

We present some remarks on the first two problems in Section 6.

$^3$ In the commutative case, an easy application of the ergodic decomposition allows one to recover the non-ergodic case of the recurrence and convergence results from the ergodic case. Unfortunately, in the non-commutative case, the ergodic decomposition is only available when the invariant factor $\mathcal{M}^\alpha$ is central, which is the case in the asymptotically abelian case, but not in general.
Table 1. Positive and negative results for non-commutative convergence and recurrence of a single shift for various values of $k$, and for various assumptions of ergodicity. The entries marked "No?" would be expected to have a negative answer if one adopts the principle that recurrence results which fail for one value of $k$ should also fail for higher values of $k$.

| Case                          | Conv. norm? | Conv. mean? | Recur. avg.? | Recur. dense? |
|-------------------------------|-------------|-------------|--------------|---------------|
| $k = 2$                       | Yes         | Yes         | Yes          | Yes           |
| $k = 3$, erg.                 | Yes         | Yes         | No           | Yes           |
| $k = 3$, non-erg.             | ???         | ???         | No           | No            |
| $k \geq 4$, even, erg.        | No          | No          | No?          | ???           |
| $k \geq 4$, even, non-erg.    | No          | No          | No?          | No?           |
| $k \geq 5$, odd, erg.         | No          | No          | No           | No            |
| $k \geq 5$, odd, non-erg.     | No          | No          | No           | No            |



Notational remark. Unfortunately this paper stands between two quite unre- 
lated uses of the word 'factor', one from operator algebras and one from ergodic 
theory. In the hope that it may be of interest to operator algebraists, we have 
deferred to their usage (even though the true notion of a factor due to Murray and 
von Neumann is actually not essential to our work), and will refer throughout to 
inclusions of von Neumann algebras, even in the commutative setting where these 
can be identified with ergodic-theoretic 'factors'. $\triangleleft$



Acknowledgements. Our thanks go to Sorin Popa for several helpful discussions, 
Francesco Fidaleo and David Kerr for references, and to Ezra Getzler for explain- 
ing Grothendieck's interpretation of a group via its sheaf of flat connections. The 
authors are indebted to the anonymous referee for careful comments and suggestions. Brown University and Universität Tübingen and University of California, Los Angeles.



2. Counterexamples 



In this section we construct various counterexamples of von Neumann systems $(\mathcal{M},\tau,\alpha)$ which will demonstrate the negative results in Theorems 1.11-1.14. The material in this section is independent of the positive results in the rest of the paper, but may provide some cautionary intuition to keep in mind when reading the proofs of those results.

2.1. Non-convergence for $k \geq 4$. We first show that convergence results fail for $k \geq 4$, even if one assumes ergodicity. In fact the divergence is so bad that it is essentially arbitrary:

Theorem 2.1 (No convergence for $k \geq 4$). Let $k \geq 4$ be an integer, and let $A \subset \mathbb{Z}$ be a set. Then there exist an ergodic von Neumann system $(\mathcal{M},\tau,\alpha)$ and elements $a_0, \ldots, a_{k-1} \in \mathcal{M}$ such that
\[ \tau\big(a_0\, \alpha^n(a_1) \cdots \alpha^{(k-1)n}(a_{k-1})\big) = 1_A(n) \]
for all integers $n$.



It is clear that this implies Theorem 1.14 by choosing A appropriately (and noting 
that failure of weak convergence implies failure of convergence in norm, by Cauchy- 
Schwarz applied in the contrapositive). 



Proof. It will suffice to verify the $k = 4$ case, as the higher cases follow by setting $a_j = 1$ for $j \geq 4$. We will need a group $G$ with four distinguished elements $e_0, e_1, e_2, e_3$ and an automorphism $T : G \to G$ such that $T^k$ has no fixed points other than the identity for all $k \neq 0$, and such that
\[ e_0 (T^r e_1)(T^{2r} e_2)(T^{3r} e_3) = \mathrm{id} \]
holds for all $r \in A$ and fails for all $r \in \mathbb{Z} \setminus A$. The construction of such a group is somewhat non-trivial and is deferred to Appendix B, and in particular to Proposition B.8.



The group algebra $\mathbb{C}G$ of formal finite linear combinations of group elements of $G$ acts (on the left) on the Hilbert space $\ell^2(G)$ in the obvious manner (arising from convolution on $G$), and can thus be viewed as a subspace of the von Neumann algebra $B(\ell^2(G))$ (note that all the elements of $G$ become unitary in this perspective). We can place a finite faithful trace $\tau$ on $\mathbb{C}G$ by declaring the identity element to have trace $1$, and all other elements of $G$ to have trace zero. If we then define $\mathcal{M}$ to be the closure of $\mathbb{C}G$ in the weak operator topology of $B(\ell^2(G))$, we obtain a finite von Neumann algebra, known as the group von Neumann algebra $LG$ of $G$. The shift $T$ leads to an algebra isomorphism $\alpha$ of $\mathbb{C}G$, which then easily extends to a shift $\alpha$ on $\mathcal{M} = LG$. Because none of the powers of $T$ have any non-trivial fixed points, the orbit of any non-identity group element contains no repetitions, and so one can easily establish that $\alpha^n f$ converges weakly to $\tau(f)$ as $n \to \infty$ for every $f \in \mathbb{C}G$, and hence by approximation that the unitary operator on $\ell^2(G)$ associated to $\alpha$ has no fixed points outside $\mathbb{C}\delta_{\mathrm{id}}$. This implies that $(\mathcal{M},\tau,\alpha)$ is ergodic, since given $a \in \mathcal{M}$ for which $\alpha(a) = a$ and $\tau(a) = 0$ it follows that $a(\delta_{\mathrm{id}}) \in \ell^2(G)$ is a fixed point for the action of $T$ on $\ell^2(G)$, which must therefore equal $\tau(a)\delta_{\mathrm{id}} = 0$, and hence $\tau(a^*a) = \|a(\delta_{\mathrm{id}})\|_2^2 = 0$ and so $a = 0$, by the faithfulness of $\tau$. If we now set $a_j = e_j$ for $j = 0, 1, 2, 3$ we obtain the claim. □
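The mechanism used in this proof, namely that the trace of a product of group elements is $1$ exactly when the product is the identity, can be seen in a short sketch. We work in the group algebra of the finite symmetric group $S_3$, with permutations stored as tuples and algebra elements as dictionaries; the choice of $S_3$, the sample elements `x` and `y`, and all function names are assumptions for illustration only (the group $G$ of the proof is infinite and is constructed in Appendix B).

```python
def compose(p, q):
    """Composition p o q of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def multiply(x, y):
    """Convolution product in the group algebra; elements are dicts {group element: coefficient}."""
    out = {}
    for g, cg in x.items():
        for h, ch in y.items():
            gh = compose(g, h)
            out[gh] = out.get(gh, 0) + cg * ch
    return out

def adjoint(x):
    """(sum c_g g)* = sum conj(c_g) g^{-1}."""
    return {inverse(g): complex(c).conjugate() for g, c in x.items()}

IDENTITY = (0, 1, 2)

def tau(x):
    """Trace: the coefficient of the identity element."""
    return x.get(IDENTITY, 0)

# Hypothetical sample elements of the group algebra of S_3.
x = {(1, 0, 2): 2, (1, 2, 0): 1j}
y = {(0, 2, 1): 3, (2, 0, 1): -1, IDENTITY: 0.5}

cycle = {(1, 2, 0): 1}        # a 3-cycle, viewed as an algebra element
cycle_inv = {(2, 0, 1): 1}    # its inverse
```

For these inputs `tau(multiply(x, y))` and `tau(multiply(y, x))` agree, `tau(multiply(adjoint(x), x))` equals $|2|^2 + |i|^2 = 5$, and `tau(multiply(cycle, cycle_inv))` is $1$ while `tau(multiply(cycle, cycle))` is $0$.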



Remark 2.1. An inspection of the proofs of Theorem 2.1 and Proposition B.8 shows that the expression $a_0\, \alpha^n(a_1)\, \alpha^{2n}(a_2)\, \alpha^{3n}(a_3)$ can more generally be replaced by $\alpha^{c_0 n}(a_0)\, \alpha^{c_1 n}(a_1)\, \alpha^{c_2 n}(a_2)\, \alpha^{c_3 n}(a_3)$ whenever $c_0, c_1, c_2, c_3$ are integers with $c_i \neq c_{i+1}$ for all $i = 0, 1, 2, 3$ (with the cyclic convention $c_{i+4} = c_i$). Thus for instance one can construct von Neumann systems for which
\[ \tau\big(a_0 (\alpha^n(a_1))\, a_2\, \alpha^n(a_3)\big) = 1_A(n) \]
for an arbitrary set $A$. We omit the details. $\triangleleft$

[Figure 1. A hexagon with vertices $x$, $x+h$, $x+k$, $x+2h+k$, $x+2k+h$, $x+2h+2k$. Note the absence of arithmetic progressions of length three.]

Remark 2.2. The examples of non-convergence given above are not self-adjoint or positive, and the $a_j$ are not equal to each other. However, it is not hard to modify the examples to give an example of a positive $a_j = a$ for which the averages $\frac{1}{N} \sum_{n=1}^N \tau(a\, \alpha^n(a)\, \alpha^{2n}(a)\, \alpha^{3n}(a))$ do not converge. Indeed, one can repeat the above



construction with 



1 



3 



i=0 

this is easily seen to be positive and self-adjoint, and a modification of the above 
computations then shows that 

r(aa»a 2 »a 3 ») = 1 + ~M™) 

for all n, which is enough to ensure divergence by choosing A appropriately. We 
leave the details to the reader. $\triangleleft$

Remark 2.3. The group $G$ constructed here can easily be shown to have infinite conjugacy classes (by the same methods used to prove Proposition B.8). This implies that the group von Neumann algebra $LG$ is a factor. We refer to Kadison, Ringrose [26, Theorem 6.7.5] for details. $\triangleleft$

2.2. Negative averages for $k = 3$. We now show the negativity of various triple averages. The main tool is the following Behrend-type construction of a set which avoids progressions of length three, but contains many "hexagons":

Lemma 2.2 (Behrend-type example). Let $\varepsilon > 0$. Then for all sufficiently large $d$, there exists a subset $F$ of $\mathbb{Z}/d\mathbb{Z}$ such that $|F| \geq d^{1-\varepsilon}$, but $F$ contains no non-trivial arithmetic progressions of length three, thus $n, n+r, n+2r \in F$ can only occur if $r = 0$. On the other hand, the set
\[ \{(x, h, k) \in (\mathbb{Z}/d\mathbb{Z})^3 : x, x+h, x+k, x+k+2h, x+2k+h, x+2k+2h \in F\} \]
of "hexagons" in $F$ has cardinality at least $d^{3-\varepsilon}$.

We remark that the first part of the lemma already follows directly from the work
of Behrend [2] or the earlier work of Salem and Spencer [33]. The claim about
hexagons will be needed in the proof of Theorem 2.6 below, but is not needed for
the simpler results in Corollary 2.4 or Theorem 2.5.



Proof. Let $R$ be a large multiple of 400 (depending on $\varepsilon$). We claim that for $n$ a
large enough multiple of 4 (depending on $R$), the set $\{-R, \dots, R\}^n \subset \mathbb{Z}^n$ contains
a subset $E$ of cardinality $|E| > e^{-O(n)} R^n$ (where the implied constant in the $O()$
notation is absolute), and which contains at least $e^{-O(n)} R^{3n}$ hexagons $\{x, x+h,
x+k, x+k+2h, x+2k+h, x+2k+2h\}$ but contains no arithmetic progressions of
length three. Choosing $d$ sufficiently large, letting $n$ be the largest integer such that
$(10R)^n < d$ and then embedding $\{-R, \dots, R\}^n$ in $\mathbb{Z}/d\mathbb{Z}$ using base $10R$ (say), as in
the work of Behrend or Salem-Spencer, this claim will imply the lemma (choosing
$R$ sufficiently large depending on $\varepsilon$).

It remains to establish the claim. From the classical results on the Waring problem
(see e.g. [38]), we know that every large integer $N$ has $\sim N^{(k-2)/2}$ representations
as the sum of $k$ squares for $k$ large enough (one can for instance take $k = 5$,
but for our purposes any fixed $k$ will suffice). Using this, we see that for any
fixed $\delta \in (0, \frac{1}{10})$, every integer $r$ such that $\delta R^2 n < r < \frac{1}{10} R^2 n$ (say) will have
$\gg (c_\delta R)^{n - C_\delta}$ representations as the sum of $n$ squares of integers less than $R$, where
$c_\delta, C_\delta > 0$ depend only on $\delta$. In other words, the sphere $E_r := \{x \in \{-R, \dots, R\}^n :
|x|^2 = r\}$ has cardinality at least $(c_\delta R)^{n - C_\delta}$. On the other hand, such spheres have
no non-trivial progressions of length three. Thus it will suffice (for $n$ large enough)
by the pigeonhole principle to show that there are at least $e^{-O(n)} R^{3n}$ hexagons
$\{x, x+h, x+k, x+k+2h, x+2k+h, x+2k+2h\}$ in $\{-R, \dots, R\}^n$ such that

(9) $|x|^2 = |x+h|^2 = |x+k|^2 = |x+k+2h|^2 = |x+2k+h|^2 = |x+2k+2h|^2 < \frac{1}{10} R^2 n$

(note that the case when $|x|^2 < \delta R^2 n$ for sufficiently small $\delta$ can be eliminated by
crude estimates).

To count the solutions to (9), we perform some elementary changes of variable to
replace the constraints in (9) with simpler constraints. We begin by observing that
if $a, b, c \in \{-R/100, \dots, R/100\}^n$ are such that

(10) $a \cdot b = b \cdot c = c \cdot a = 0; \quad c \cdot c = 3 b \cdot b$

then $x := a - 2b$, $h := b + c$, $k := b - c$ can be verified to be a solution to (9), with
the map $(a, b, c) \mapsto (x, h, k)$ being injective, so it suffices to show that there are at
least $e^{-O(n)} R^{3n}$ triples $(a, b, c)$ with the above properties.
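The algebra behind this change of variables can be spot-checked numerically. The following Python sketch takes one illustrative solution $(a, b, c)$ of (10) (the specific vectors are an assumption chosen for this example, not part of the construction) and confirms that the six hexagon vertices built from $x = a - 2b$, $h = b + c$, $k = b - c$ all have the same squared length, namely $|a|^2 + 4|b|^2$.

```python
def dot(u, v):
    """Euclidean inner product of two integer vectors."""
    return sum(p * q for p, q in zip(u, v))


def hexagon_norms(a, b, c):
    """Squared lengths of the six hexagon vertices built from (a, b, c)."""
    x = [ai - 2 * bi for ai, bi in zip(a, b)]
    h = [bi + ci for bi, ci in zip(b, c)]
    k = [bi - ci for bi, ci in zip(b, c)]
    # (s, t) encodes the vertex x + s*h + t*k, in the order
    # x, x+h, x+k, x+k+2h, x+2k+h, x+2k+2h.
    coeffs = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 2), (2, 2)]
    vertices = [
        [xi + s * hi + t * ki for xi, hi, ki in zip(x, h, k)]
        for s, t in coeffs
    ]
    return [dot(v, v) for v in vertices]


# Illustrative (hypothetical) solution of (10): pairwise orthogonal, |c|^2 = 3|b|^2.
a = [2, 0, 0, 0, 0]
b = [0, 1, 0, 0, 0]
c = [0, 0, 1, 1, 1]
assert dot(a, b) == dot(b, c) == dot(c, a) == 0
assert dot(c, c) == 3 * dot(b, b)

norms = hexagon_norms(a, b, c)
print(norms)
```

All six entries agree because each vertex has the form $a + \lambda b + \mu c$ with $\lambda^2 + 3\mu^2 = 4$, so the orthogonality relations in (10) give squared length $|a|^2 + 4|b|^2$.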

For reasons that will become clearer later, we will initially work in dimension $n/4$
rather than $n$. Using the Waring problem results as before, we can find at least
$e^{-O(n)} R^{3n/4}$ triples $a, b, c \in \{-R/400, \dots, R/400\}^{n/4}$ such that

$c \cdot c = 3 b \cdot b$.

This is one of the four constraints required for (10). To obtain the remaining
constraints, we use a pigeonholing trick followed by a tensor power trick. Firstly,
observe that whenever $a, b, c \in \{-R/400, \dots, R/400\}^{n/4}$, then $a \cdot b$, $b \cdot c$, $c \cdot a$ are of order
$O(R^2 n) \le e^{O(n)}$. Applying the pigeonhole principle, one can thus find $h_1, h_2, h_3 =
O(R^2 n)$ such that there are $e^{-O(n)} R^{3n/4}$ triples $a, b, c \in \{-R/400, \dots, R/400\}^{n/4}$
with

(11) $a \cdot b = h_1; \quad b \cdot c = h_2; \quad c \cdot a = h_3; \quad c \cdot c = 3 b \cdot b$.



This is an inhomogeneous version of (10) (at dimension $n/4$ rather than $n$), with
the zero coefficients replaced by more general coefficients $h_1, h_2, h_3$. To eliminate
these coefficients we use a tensor power trick. Let $S \subset \{-R/400, \dots, R/400\}^{n/4} \times
\{-R/400, \dots, R/400\}^{n/4} \times \{-R/400, \dots, R/400\}^{n/4}$ be the set of all triples $(a, b, c)$
obeying (11). We then observe that if $(a_i, b_i, c_i) \in S$ for $i = 1, 2, 3, 4$, then the
vectors $a, b, c \in \mathbb{Z}^n$ defined by

$a := (a_1, a_2, a_3, a_4); \quad b := (b_1, b_2, -b_3, -b_4); \quad c := (c_1, -c_2, c_3, -c_4)$

solve (10). The map from the $(a_i, b_i, c_i)$ to $(a, b, c)$ is an injection from $S^4$ to the
solution set of (10), and so we obtain at least $|S|^4 \ge e^{-O(n)} R^{3n}$ solutions to (10)
as required. □
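The sign pattern in the tensor power trick works because the sign vectors $(1,1,-1,-1)$, $(1,-1,1,-1)$ and their product each sum to zero, so the common values $h_1, h_2, h_3$ cancel while $c \cdot c = 3 b \cdot b$ is preserved. The Python sketch below illustrates this on a small example; the base triple and the coordinate permutations used to produce four triples with identical inner products are assumptions made purely for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two integer vectors."""
    return sum(p * q for p, q in zip(u, v))


def tensor_combine(triples):
    """Concatenate four triples (a_i, b_i, c_i) using the sign pattern
    a = (a1, a2, a3, a4), b = (b1, b2, -b3, -b4), c = (c1, -c2, c3, -c4)."""
    signs_b = [1, 1, -1, -1]
    signs_c = [1, -1, 1, -1]
    a = [t for (ai, _, _) in triples for t in ai]
    b = [s * t for s, (_, bi, _) in zip(signs_b, triples) for t in bi]
    c = [s * t for s, (_, _, ci) in zip(signs_c, triples) for t in ci]
    return a, b, c


def permute(v, p):
    """Reorder the coordinates of v according to the index list p."""
    return [v[i] for i in p]


# Hypothetical base triple obeying (11) with h1 = 1, h2 = 1, h3 = 3, c.c = 3 b.b.
base = ([1, 2, 0, 1], [1, 0, 0, 0], [1, 1, 1, 0])
# Permuting coordinates simultaneously keeps all inner products unchanged,
# giving four distinct members of the same set S.
perms = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
triples = [tuple(permute(v, p) for v in base) for p in perms]

a, b, c = tensor_combine(triples)
print(dot(a, b), dot(b, c), dot(c, a), dot(c, c), 3 * dot(b, b))
```

The combined vectors live in four times the original dimension, which is why the argument starts in dimension $n/4$.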
This leads to a useful matrix counterexample:

Lemma 2.3 (Restricted third moment can be negative). There exists a positive
semi-definite Hermitian matrix $(A(j, k))_{1 \le j, k \le d}$ for which the quantity

(12) $\sum_{n, r \in \mathbb{Z}/d\mathbb{Z}} A(n, n+r) A(n+r, n+2r) A(n+2r, n)$

is negative, where we extend $A(i, j)$ periodically in both variables by $d$.

Proof. We will take $d$ to be a multiple of 3, and $A(j, k)$ to take the form

$A(j, k) := 1_E(j) 1_E(k) + 1_E(j) \omega^{-j} 1_E(k) \omega^{k}$

where $E \subset \mathbb{Z}/d\mathbb{Z}$ is a set to be determined later, and $\omega := e^{2\pi i/3}$ is a cube root of
unity. The matrix $(A(j, k))_{1 \le j, k \le d}$ is then the sum of two (unnormalised) rank one
projections and is thus positive semi-definite and Hermitian. The expression (12) can
be expanded as

$\sum_{n, r \in \mathbb{Z}/d\mathbb{Z} : n, n+r, n+2r \in E} (1 + \omega^{r})(1 + \omega^{r})(1 + \omega^{-2r})$.

The summand can be computed to equal 8 when $r$ is divisible by 3, and $-1$ otherwise.
Thus, to establish the claim, it suffices to find a set $E$ such that the set

$\{(n, r) \in (\mathbb{Z}/d\mathbb{Z})^2 : n, n+r, n+2r \in E; \ r \neq 0 \bmod 3\}$

is more than eight times larger than the set

$\{(n, r) \in (\mathbb{Z}/d\mathbb{Z})^2 : n, n+r, n+2r \in E; \ r = 0 \bmod 3\}$,

thus the length three arithmetic progressions in $E$ with spacing not divisible by 3
need to overwhelm the length three progressions with spacing divisible by 3.
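The two values taken by the summand are easy to confirm numerically. The following Python sketch evaluates $(1 + \omega^{r})^2 (1 + \omega^{-2r})$ for a range of integers $r$, with $\omega$ a primitive cube root of unity; the range of $r$ is an arbitrary choice for illustration.

```python
import cmath

# Primitive cube root of unity.
omega = cmath.exp(2j * cmath.pi / 3)


def summand(r):
    """Value of (1 + omega^r)(1 + omega^r)(1 + omega^(-2r)) for an integer r."""
    return (1 + omega ** r) ** 2 * (1 + omega ** (-2 * r))


values = {r: summand(r) for r in range(-6, 7)}
for r, value in values.items():
    print(r, round(value.real, 9), round(value.imag, 9))
```

When $r$ is not divisible by 3, one has $\omega^{-2r} = \omega^{r}$ and $1 + \omega^{r} = -\omega^{2r}$, so the product is $(-\omega^{2r})^3 = -1$, in agreement with the output.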



To do this, we use Lemma 2.2 to obtain a subset $F \subset \{1, \dots, \lfloor d/10 \rfloor\}$ of cardinality
$|F| > d^{0.99}$ which contains no non-trivial arithmetic progressions of length three. We
then pick three random shifts $h_0, h_1, h_2 \in \{1, \dots, d/3\}$ uniformly at random, and consider
the set

$E := \{3(f + h_i) + i : i = 0, 1, 2; \ f \in F\}$

consisting of three randomly shifted, dilated copies of $F$.

By construction, the only length three progressions in $E$ with spacing divisible
by 3 are the trivial progressions $n, n, n$ with $r = 0$, so the total number of such
progressions is at most $d$. On the other hand, for any fixed $f_0, f_1, f_2 \in F$, the
numbers $3(f_i + h_i) + i$ for $i = 0, 1, 2$ have a probability $3/d$ of forming an arithmetic
progression with spacing not divisible by 3, due to the random nature of the $h_i$.
Thus the expected value of the total number of such progressions is at least $(d^{0.99})^3 \times
3/d = 3 d^{1.97}$. For $d$ large enough, this gives the claim. □
This already gives a simple example of negative averages for non-ergodic systems:

Corollary 2.4 (Negative average for non-ergodic system). There exists a finite
von Neumann algebra $(\mathcal{M}, \tau)$ with a shift $\alpha$, and a non-negative element $a \in \mathcal{M}$,
such that $\frac{1}{2N+1} \sum_{n=-N}^{N} \tau(a \alpha^n(a) \alpha^{2n}(a))$ converges to a negative number.



Proof. Let $a = (A(j, k))_{1 \le j, k \le d}$ be as in Lemma 2.3. We let $\mathcal{M}$ be the von Neumann
algebra of complex $d \times d$ matrices with the normalised trace $\tau$, and with the shift

$\alpha((B(j, k))_{1 \le j, k \le d}) := (e^{2\pi i (j-k)/d} B(j, k))_{1 \le j, k \le d}$.

This is easily verified to be a shift. We see that

$\tau(a \alpha^n(a) \alpha^{2n}(a)) = \frac{1}{d} \sum_{j, k, l \in \mathbb{Z}/d\mathbb{Z}} e^{2\pi i n (k + l - 2j)/d} A(j, k) A(k, l) A(l, j)$.

This expression is periodic in $n$ with period $d$, and has average

$\frac{1}{d} \sum_{j, r \in \mathbb{Z}/d\mathbb{Z}} A(j, j+r) A(j+r, j+2r) A(j+2r, j)$

and the claim then follows from Lemma 2.3. □
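The averaging step in this proof can be tested directly on a small matrix. The sketch below is a minimal Python check, under the assumption that the shift is $\alpha(B)(j,k) = e^{2\pi i (j-k)/d} B(j,k)$ as above: it averages $\tau(a \alpha^n(a) \alpha^{2n}(a))$ over one period $n = 0, \dots, d-1$ and compares the result with $\frac{1}{d} \sum_{j,r} A(j, j+r) A(j+r, j+2r) A(j+2r, j)$. The matrix used is a seeded random complex matrix chosen only for illustration; the identity does not require positivity.

```python
import cmath
import random


def alpha_power(B, n):
    """Apply the n-th power of the shift: multiply entry (j, k) by e^{2 pi i n (j-k)/d}."""
    d = len(B)
    return [[cmath.exp(2j * cmath.pi * n * (j - k) / d) * B[j][k]
             for k in range(d)] for j in range(d)]


def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    d = len(P)
    return [[sum(P[i][t] * Q[t][j] for t in range(d)) for j in range(d)]
            for i in range(d)]


def tau(B):
    """Normalised trace."""
    d = len(B)
    return sum(B[i][i] for i in range(d)) / d


def averaged_trace(A):
    """Average of tau(A alpha^n(A) alpha^{2n}(A)) over one period n = 0..d-1."""
    d = len(A)
    total = 0
    for n in range(d):
        total += tau(matmul(matmul(A, alpha_power(A, n)), alpha_power(A, 2 * n)))
    return total / d


def progression_average(A):
    """(1/d) * sum over j, r of A(j,j+r) A(j+r,j+2r) A(j+2r,j), indices mod d."""
    d = len(A)
    return sum(A[j][(j + r) % d] * A[(j + r) % d][(j + 2 * r) % d] * A[(j + 2 * r) % d][j]
               for j in range(d) for r in range(d)) / d


rng = random.Random(0)  # fixed seed so the run is reproducible
d = 5
A = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(d)]
     for _ in range(d)]
lhs = averaged_trace(A)
rhs = progression_average(A)
print(abs(lhs - rhs))
```

The agreement reflects the fact that averaging $e^{2\pi i n (k+l-2j)/d}$ over a period keeps only the terms with $k + l = 2j$ in $\mathbb{Z}/d\mathbb{Z}$, which are exactly the three-term progressions.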



This shows that recurrence on average for $k = 3$ can fail for non-ergodic systems.
However, this is not yet enough to establish either Theorem 1.11 or Theorem 1.12.
To obtain these stronger results we must introduce the crossed product construction 
in von Neumann algebras. For a comprehensive introduction to this concept, see 
PS1 Chapter 13]. We shall just recall the key properties of this construction we 
need here. 

Suppose we have a finite von Neumann algebra $(\mathcal{M}, \tau)$, and an action $U$ of a
(discrete) group $G$ on $\mathcal{M}$, thus for each $g \in G$ we have a shift $U(g) : \mathcal{M} \to \mathcal{M}$ such
that $U(g) U(h) = U(gh)$ for all $g, h \in G$, with $U(\mathrm{id})$ being the identity. Then there
exists a crossed product $(\mathcal{M} \rtimes_U G, \tau)$ which contains both the original space $(\mathcal{M}, \tau)$
and the group algebra $\mathbb{C}G$ as subalgebras. Furthermore, in this crossed product we
have

(13) $U(g) a = g a g^{-1}$

for all $a \in \mathcal{M}$ and $g \in G$, and

$\tau(ga) = \tau(ag) = 0$

for all $a \in \mathcal{M}$ and $g \in G$ with $g$ not equal to the identity. Finally, the span of the
elements $ag$ for $a \in \mathcal{M}$ and $g \in G$ is dense in $\mathcal{M} \rtimes_U G$.

Remark 2.4. The exact construction of the crossed product is not relevant for our
applications, but for the convenience of the reader we sketch one such construction
here. We first form the Hilbert space

$\mathfrak{H} := \ell^2(G, L^2(\tau)) = \bigoplus_{g \in G} L^2(\tau)$

consisting of tuples $(x_g)_{g \in G}$ in $L^2(\tau)$. This space has an action of $\mathcal{M}$ defined by

$a (x_g)_{g \in G} := ((U(g^{-1}) a) x_g)_{g \in G}$

for $a \in \mathcal{M}$, and an action of $G$ (and hence $\mathbb{C}G$) defined by

$h (x_g)_{g \in G} := (x_{h^{-1} g})_{g \in G}$.

One can verify that these actions combine to an action of the twisted convolution
algebra $\ell^1(G, \mathcal{M})$ on $\mathfrak{H}$, defined as the space of formal sums $\sum_{h \in G} h a_h$ with
$\sum_{h \in G} \|a_h\| < \infty$, and subject to the relations (13). We define a trace on such sums
by the formula $\tau(\sum_{h \in G} h a_h) := \tau(a_{\mathrm{id}})$. One can then show that one can extend
this to a finite trace on the weak operator topology closure of $\ell^1(G, \mathcal{M})$, viewed
as a subset of $B(\mathfrak{H})$; this closure can then be denoted $\mathcal{M} \rtimes_U G$. In other words,
$\mathcal{M} \rtimes_U G$ is constructed as the von Neumann algebra generated by the action of $\mathcal{M}$
and $G$ on $\mathfrak{H}$. ◁

Example 2.5. The group von Neumann algebra $LG$ can be viewed as $\mathbb{C} \rtimes G$, where
$G$ acts trivially on the one-dimensional von Neumann algebra $\mathbb{C}$. ◁

We can now get a stronger version of Corollary 2.4:

Theorem 2.5 (Negative trace for non-ergodic system). There exists a von Neumann
dynamical system $(\mathcal{M}, \tau, \alpha)$ and a non-negative element $a \in \mathcal{M}$, such that
$\tau(a \alpha^n(a) \alpha^{2n}(a))$ is negative (and independent of $n$) for all non-zero $n$. In particular,
Theorem 1.12 holds.

Proof. Let $(\mathcal{M}', \tau, \beta)$ be a von Neumann dynamical system to be chosen later.
Using the crossed product construction, we can build an extension $\mathcal{M} := \mathcal{M}' \rtimes \mathbb{Z}^2$
of $\mathcal{M}'$ generated by $\mathcal{M}'$ and two commuting unitary elements $u$, $m$, such that

(14) $m a m^{-1} = \beta(a)$

and

$u a u^{-1} = a$

for all $a \in \mathcal{M}'$. In particular, the element $u$ is central. It is then easy to see that
we can build a shift $\alpha$ on $\mathcal{M}$ for which

$\alpha(a) = a; \quad \alpha(u) = u; \quad \alpha(m) = mu$

for all $a \in \mathcal{M}'$, since the action of the group $\mathbb{Z}^2$ generated by $m$ and $u$ on $\mathcal{M}'$ is
unchanged when one replaces $m$ by $mu$.

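Now let $a \in \mathcal{M}$ be an element of the form

$a := \left( \sum_{j \in \mathbb{Z}} f_j m^j \right) \left( \sum_{j \in \mathbb{Z}} f_j m^j \right)^*$

where $f_j \in \mathcal{M}'$, and only finitely many of the $f_j$ are non-zero. This is clearly
non-negative, and can be simplified by (14) to the power series

$a = \sum_{h \in \mathbb{Z}} g_h m^h$

where the $g_h \in \mathcal{M}'$ are the twisted autocorrelations of the $f_j$,

$g_h := \sum_{j \in \mathbb{Z}} f_{j+h} \beta^h(f_j^*)$.

Footnote: To build $\alpha$ explicitly, we can view $\mathcal{M}$ as an algebra of operators on the
Hilbert space $\mathfrak{H} := \bigoplus_{(j,k) \in \mathbb{Z}^2} L^2(\tau)$ as per Remark 2.4, and let $\alpha$ be the conjugation
$a \mapsto W a W^*$ by the unitary operator $W : \mathfrak{H} \to \mathfrak{H}$ defined by
$W((x_{(j,k)})_{(j,k) \in \mathbb{Z}^2}) := (x_{(j,k-j)})_{(j,k) \in \mathbb{Z}^2}$.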

Let $n$ be non-zero. The expression $\tau(a \alpha^n(a) \alpha^{2n}(a))$ can be expanded as

$\sum_{h_1, h_2, h_3 \in \mathbb{Z}} \tau(g_{h_1} m^{h_1} \, g_{h_2} (m u^{n})^{h_2} \, g_{h_3} (m u^{2n})^{h_3})$.

The net power of the central element $u$ here is $n(h_2 + 2h_3)$, and the net power of $m$ is
$h_1 + h_2 + h_3$. Thus we see that the trace vanishes unless $h_2 + 2h_3 = h_1 + h_2 + h_3 = 0$,
or equivalently if $(h_1, h_2, h_3) = (h, -2h, h)$ for some $h$. Performing this substitution
and using (14), we simplify this expression to

(15) $\sum_{h \in \mathbb{Z}} \tau(g_h \beta^h(g_{-2h}) \beta^{-h}(g_h))$.

In particular, this expression is now manifestly independent of $n \neq 0$.

We now select $\mathcal{M}'$ to be the commutative von Neumann system $L^\infty(\mathbb{Z}/d\mathbb{Z})$ with
the shift $\beta(f)(x) := f(x+1)$ and the normalised trace. Thus the $g_h$ and $f_h$ are
now complex-valued functions on $\mathbb{Z}/d\mathbb{Z}$, and the above expression can be expanded
explicitly as

$\frac{1}{d} \sum_{h \in \mathbb{Z}} \sum_{x \in \mathbb{Z}/d\mathbb{Z}} g_h(x) g_{-2h}(x+h) g_h(x-h)$.

Meanwhile, the $g_h(x)$ by definition can be written as

$g_h(x) = \sum_{j \in \mathbb{Z}} f_{j+h}(x) \overline{f_j(x+h)}$.

We pick a large number $N$ to be chosen later, and set

$f_j(x) := b(x, x+j) 1_{1 \le j \le Nd}$

where $b : \mathbb{Z}/d\mathbb{Z} \times \mathbb{Z}/d\mathbb{Z} \to \mathbb{C}$ is a function periodic in two variables of period $d$ to
be chosen later. Then we can compute

$g_h(x) = \left( 1 - \frac{|h|}{dN} \right)_+ N A(x, x+h) + O(1)$

where

(16) $A(x, y) := \sum_{z \in \mathbb{Z}/d\mathbb{Z}} b(x, z) \overline{b(y, z)}$

and $O(1)$ denotes a quantity that can depend on $d$ (and $b$) but is uniformly bounded
in $N$. The expression (15) can then be computed to be

$C \frac{N^4}{d} \sum_{x, h \in \mathbb{Z}/d\mathbb{Z}} A(x, x+h) A(x+h, x-h) A(x-h, x) + O(N^3)$

where $C > 0$ is the explicit constant

$C := \int_{\mathbb{R}} (1 - |h|)_+^2 (1 - |2h|)_+ \, dh$.
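For concreteness, the constant can be evaluated in closed form. The Python sketch below assumes the integrand is $(1 - |h|)_+^2 (1 - |2h|)_+$, which is supported on $|h| \le 1/2$ and even in $h$, and integrates the resulting cubic polynomial exactly with rational arithmetic; the helper names are illustrative only.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def integrate(p, lo, hi):
    """Exact integral of the polynomial p over the interval [lo, hi]."""
    return sum(c * (hi ** (i + 1) - lo ** (i + 1)) / (i + 1)
               for i, c in enumerate(p))


# On 0 <= h <= 1/2 the integrand is (1 - h)^2 (1 - 2h); it vanishes for |h| > 1/2.
one_minus_h = [Fraction(1), Fraction(-1)]
one_minus_2h = [Fraction(1), Fraction(-2)]
integrand = poly_mul(poly_mul(one_minus_h, one_minus_h), one_minus_2h)

# The integrand is even, so integrate over [0, 1/2] and double.
C = 2 * integrate(integrand, Fraction(0), Fraction(1, 2))
print(C)
```

Only the positivity of $C$ matters for the argument; the exact value is not used elsewhere.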
By the substitution $x = m + r$, $h = r$, we can re-express this as

(17) $C \frac{N^4}{d} \sum_{m, r \in \mathbb{Z}/d\mathbb{Z}} A(m, m+r) A(m+r, m+2r) A(m+2r, m) + O(N^3)$.

Now, let $d$ and $A(j, k)$ be as in Lemma 2.3. By the spectral theorem (which in
particular allows one to construct self-adjoint square roots of positive definite
matrices), we can find $b(x, y)$ so that (16) holds. The sum in (17) is then negative,
and the claim follows by choosing $N$ large enough depending on all other parameters. □



Of course, by Theorem 1.10 one cannot have such a result when the underlying
shift $\alpha$ is ergodic. On the other hand, one can extend Corollary 2.4 to the ergodic
case:

Theorem 2.6. There exists an ergodic von Neumann system $(\mathcal{M}, \tau, \alpha)$ and a
non-negative element $a \in \mathcal{M}$, such that $\frac{1}{2N+1} \sum_{n=-N}^{N} \tau(a \alpha^n(a) \alpha^{2n}(a))$ converges to a
negative number. In particular, Theorem 1.11 holds.

Proof. Let $d$ be a large odd number, and let $\omega := e^{2\pi i/d}$ be a primitive $d$-th root
of unity. We will let $\mathcal{M}$ be a completion of the non-commutative torus. This is
obtained by first forming the C*-algebra generated by two unitary generators $e_1, e_2$
obeying the commutation relation

$e_2 e_1 = \omega e_1 e_2$

and with all of the expressions $e_1^j e_2^k$ having zero trace unless $j = k = 0$, in which
case the trace is 1; and then completing in the weak operator topology resulting
from the Gel'fand-Naimark-Segal representation on $L^2(\tau)$. One can represent this
finite von Neumann algebra more explicitly by letting $e_1, e_2$ act on $L^2((\mathbb{R}/\mathbb{Z})^2)$ by
the maps $e_1 f(x, y) := e^{2\pi i x} f(x, y)$ and $e_2 f(x, y) := e^{2\pi i y} f(x + 1/d, y)$, with the
trace $\tau$ given by $\tau(a) = \langle \Omega, a \Omega \rangle_{L^2((\mathbb{R}/\mathbb{Z})^2)}$, where $\Omega \equiv 1$ is the constant function 1 on
$(\mathbb{R}/\mathbb{Z})^2$.

We let $\theta_1, \theta_2 \in S^1$ be generic unit phases, and then define the shift $\alpha$ on $\mathcal{M}$ by
setting

$\alpha(e_1) := \theta_1 e_1; \quad \alpha(e_2) := \theta_2 e_2$.

It is easy to see that this is a shift. If $\theta_1, \theta_2$ are generic (so that $\theta_1^j \theta_2^k$ is not a root
of unity for any $(j, k) \neq (0, 0)$), this shift is easily verified to be ergodic (as one can
verify the mean ergodic theorem by hand on the generators $e_1^j e_2^k$, and then argue
as in the proof of Theorem 2.1 using the faithfulness of $\tau$).



We set $a := g g^*$, where $g$ is an element of the form

$g := \sum_{k=1}^{M} \sum_{h \in \mathbb{Z}} c_h e_1^h e_2^k$,

$M$ is a large number (much larger than $d$) to be chosen later, and the $c_h$ are complex
numbers to be chosen later, all but finitely many of which are zero. Clearly $a$ is
non-negative. A computation shows that

$a = \sum_{h, k \in \mathbb{Z}} C_{h,k} e_1^h e_2^k$

where

(18) $C_{h,k} := M \left( 1 - \frac{|k|}{M} \right)_+ \sum_{l \in \mathbb{Z}} c_{l+h} \overline{c_l} \, \omega^{kl}$.


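Since

$\alpha^n(a) = \sum_{h, k \in \mathbb{Z}} C_{h,k} \theta_1^{nh} \theta_2^{nk} e_1^h e_2^k$,

some Fourier analysis and the genericity of $\theta_1, \theta_2$ show that the expression

$\frac{1}{2N+1} \sum_{n=-N}^{N} \tau(a \alpha^n(a) \alpha^{2n}(a))$

converges as $N \to \infty$ to the expression

$\sum_{h, k} C_{h,k} C_{-2h,-2k} C_{h,k} \, \tau(e_1^h e_2^k e_1^{-2h} e_2^{-2k} e_1^h e_2^k)$.

The trace here simplifies to $\omega^{3hk}$. Inserting (18), we can expand this expression as

(19) $M^3 \sum_{h, k, l_1, l_2, l_3 \in \mathbb{Z}} \phi(k/M) \, c_{l_1+h} \overline{c_{l_1}} c_{l_2-2h} \overline{c_{l_2}} c_{l_3+h} \overline{c_{l_3}} \, \omega^{k l_1 - 2k l_2 + k l_3 + 3hk}$

where

$\phi(x) := (1 - |x|)_+^2 (1 - |2x|)_+$.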
By Poisson summation, the expression

$\sum_{k \in \mathbb{Z}} \phi(k/M) \, \omega^{k l_1 - 2k l_2 + k l_3 + 3hk}$

can be computed to be $M \int_{\mathbb{R}} \phi(x) \, dx + O(1)$ if $l_1 - 2l_2 + l_3 + 3h$ is divisible by $d$,
and $O(1)$ otherwise, where $O(1)$ denotes a quantity that can depend on $d$ but is
bounded uniformly in $M$. If we then assume that the $c_h$ vanish for $h$ outside of
$\{1, \dots, M\}$ and are bounded uniformly in $M$, we can thus expand (19) as

$C M^4 \sum_{h, l_1, l_2, l_3 \in \mathbb{Z} : \ d \mid l_1 - 2l_2 + l_3 + 3h} c_{l_1+h} \overline{c_{l_1}} c_{l_2-2h} \overline{c_{l_2}} c_{l_3+h} \overline{c_{l_3}} + O(M^7)$

for some absolute constant $C > 0$.

If we now set $c_h := b(h) 1_{[1,M]}(h)$, where $b : \mathbb{Z}/d\mathbb{Z} \to \mathbb{C}$ is a periodic function with
period $d$ and independent of $M$ to be chosen later, we can express this as

$C_d M^8 \sum_{h, l_1, l_2, l_3 \in \mathbb{Z}/d\mathbb{Z} : \ l_1 - 2l_2 + l_3 + 3h = 0} b(l_1+h) \overline{b(l_1)} b(l_2-2h) \overline{b(l_2)} b(l_3+h) \overline{b(l_3)} + O(M^7)$

for some $C_d > 0$ depending on $d$ but independent of $M$. Making the substitution
$l_1 = x$; $l_2 = x + k + 2h$; $l_3 = x + 2k + h$, we see that we will be done as soon as we
are able to find $d$, $b$ for which the expression

$X := \sum_{x, h, k \in \mathbb{Z}/d\mathbb{Z}} \overline{b(x)} b(x+h) b(x+k) \overline{b(x+k+2h)} \, \overline{b(x+2k+h)} b(x+2k+2h)$

is negative.
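The substitution used here can be verified mechanically. The following Python sketch checks, for every $(x, h, k)$ modulo a small $d$ (the value 7 is an arbitrary illustrative choice), that $l_1 = x$, $l_2 = x + k + 2h$, $l_3 = x + 2k + h$ satisfies the constraint $l_1 - 2l_2 + l_3 + 3h = 0$, and that the six arguments of $b$ in the constrained sum are exactly the six hexagon vertices.

```python
from itertools import product


def check_substitution(d):
    """Return True if the change of variables is consistent for all (x, h, k) mod d."""
    for x, h, k in product(range(d), repeat=3):
        l1, l2, l3 = x, x + k + 2 * h, x + 2 * k + h
        # Constraint from the sum over h, l1, l2, l3.
        if (l1 - 2 * l2 + l3 + 3 * h) % d != 0:
            return False
        # Arguments of b appearing in the constrained sum.
        arguments = sorted(v % d for v in
                           (l1 + h, l1, l2 - 2 * h, l2, l3 + h, l3))
        # Vertices of the hexagon with parameters (x, h, k).
        hexagon = sorted(v % d for v in
                         (x, x + h, x + k, x + k + 2 * h,
                          x + 2 * k + h, x + 2 * k + 2 * h))
        if arguments != hexagon:
            return False
    return True


ok = check_substitution(7)
print(ok)
```

This is why the hexagon count from Lemma 2.2, rather than a count of arithmetic progressions, is the relevant quantity in the ergodic case.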



To do this, we again appeal to Lemma 2.2 to find a set $F \subset \mathbb{Z}/d\mathbb{Z}$ of size at least
$d^{0.99}$ (assuming $d$ large enough), which contains no non-trivial arithmetic progressions
of length three, but contains at least $d^{2.99}$ hexagons $x, x+h, x+k, x+k+2h, x+2k+h,
x+2k+2h$. We then set

$b(x) := \epsilon_x 1_F(x)$

where the $\epsilon_x = \pm 1$ are independent signs, thus $X$ is now the random variable

$\sum_{x, h, k : \ x, x+h, x+k, x+k+2h, x+2k+h, x+2k+2h \in F} \epsilon_x \epsilon_{x+h} \epsilon_{x+k} \epsilon_{x+2h+k} \epsilon_{x+h+2k} \epsilon_{x+2h+2k}$.

We will show (for $d$ large enough) that the standard deviation of $X$ exceeds its
expectation, which shows that there exists a choice of signs for which $X$ is negative.

We first compute the expectation of $X$. The only summands with non-zero
expectation occur when all the signs cancel, which only occurs when $h = 0$ or when
$k = 0$, as can be seen by an inspection of the number of ways to collapse the
hexagon in Figure 1; here we need the hypothesis that $d$ is odd. But as $F$ contains
no non-trivial arithmetic progressions, there are no summands for which only one
of the $h, k$ are zero, so we are left only with the $h = k = 0$ terms, of which there
are at most $d$. Thus the expectation of $X$ is at most $d$.

Now we compute the variance. There are at least $d^{2.99}$ hexagons in $F$, and all but
$O(d^2)$ of them are non-degenerate in the sense that the six vertices of the hexagon
are all distinct. The summands in $X$ corresponding to non-degenerate hexagons
have variance 1, and the correlation between any two summands in $X$ is either zero
or positive (the latter occurs when two summands are permutations of each other).
Thus the variance of $X$ is $\gg d^{2.99}$, so the standard deviation is $\gg d^{1.495}$, and the
claim follows. □

2.3. Negative trace for k = 5. Now we show negative traces can occur even in
the ergodic case when $k = 5$.

Theorem 2.7. There exists an ergodic von Neumann dynamical system $(\mathcal{M}, \tau, \alpha)$
and a non-negative element $a \in \mathcal{M}$, such that $\tau(a \alpha^n(a) \alpha^{2n}(a) \alpha^{3n}(a) \alpha^{4n}(a))$ is
negative for every non-zero $n$.

This establishes the $k = 5$ case of Theorem 1.13. A similar argument holds for all
larger odd values of $k$, which we leave to the interested reader; we restrict here to
the case $k = 5$ simply for ease of notation.

To prove this theorem, our starting point is the following result of Bergelson, Host,
Kra, and Ruzsa [3]:

Theorem 2.8. For any $\delta > 0$, there exists a measure-preserving system $(X, \mathcal{X}, \mu, S)$
and a measurable set $A \subset X$ with $0 < \mu(A) < \delta$ such that

$\mu(A \cap S^n(A) \cap S^{2n}(A) \cap S^{3n}(A) \cap S^{4n}(A)) < \mu(A)^{100}$

and

(20) $\mu(A \cap S^n(A)) = \mu(A)^2$

for all non-zero $n$.

Proof. This follows from [3, Theorem 1.3] (see also the remark immediately below
that theorem). The property (20) is not explicitly stated in that theorem, but
follows from the construction in [4, Section 2.3] (the system $X$ is a torus $(\mathbb{R}/\mathbb{Z})^2$
with the skew shift $S : (x, y) \mapsto (x + \alpha, y + 2x + \alpha)$, and the set $A$ has the special
form $A = (\mathbb{R}/\mathbb{Z}) \times B$ for some set $B$). □



We apply this theorem for some sufficiently small $\delta$ (to be chosen later) to obtain
$X, \mu, S, A$ with the above properties. We will combine this with the group $G$, the
automorphism $T$, and the elements $e_0, e_1, e_2, e_3, e_4$ arising from Proposition B.9 as
follows.

First, we create the product space $L^\infty(X^G, d\mu^G)$, whose $\sigma$-algebra is generated up
to negligible sets by the tensor products $\bigotimes_{g \in G} f_g$, where $f_g \in L^\infty(X, d\mu)$ is equal
to 1 for all but finitely many $g$. This product has a unitary, trace-preserving action
$U$ of $G$, defined by

$U(h) \bigotimes_{g \in G} f_g := \bigotimes_{g \in G} f_{h^{-1} g}$.

We can therefore create the crossed product $\mathcal{M} := L^\infty(X^G, d\mu^G) \rtimes_U G$. Note that
if we embed $L^\infty(X, \mu)$ into $L^\infty(X^G, d\mu^G)$ by using the identity component of $X^G$,
we have

(21) $\bigotimes_{g \in G} f_g = \prod_{g \in G} U(g) f_g$

(note that the $U(g) f_g$ necessarily commute with each other).

We define a shift $\alpha$ on $\mathcal{M}$ by requiring that

$\alpha\left( \bigotimes_{g \in G} f_g \right) = \bigotimes_{g \in G} S(f_{T^{-1} g})$

and

$\alpha(g) = Tg$;

one can check that this is indeed a well-defined shift on $\mathcal{M}$.

We claim that $\alpha$ is ergodic. Indeed, if $a \in \mathcal{M}$ is of the form $a = fg$ for some
$f \in L^\infty(X^G, d\mu^G)$ and $g \in G$ not equal to the identity, then as the powers of $T$ have
no non-trivial fixed points, the orbit $T^n g$ escapes to infinity, and the orbit $\alpha^n(a)$
converges weakly to zero. Meanwhile, if $g$ is the identity, then it is classical that the
Bernoulli system $G \curvearrowright L^\infty(X^G, d\mu^G)$ is ergodic, and so the ergodic theorem applies
to $a$ in this case. Putting the two facts together and arguing as for the ergodicity
in Theorem 2.1 yields the ergodicity of $\alpha$.



Note that $1_A$ lies in $L^\infty(X, d\mu)$, and can thus be identified with an element of $\mathcal{M}$
by the previous embedding. We set

$a := \sum_{j=0}^{4} 1_A \cdot (2 - e_j - e_j^{-1}) \cdot 1_A$.

Clearly $a$ is non-negative. Now let $n$ be non-zero, and consider the expression

(22) $\tau(a \alpha^n(a) \alpha^{2n}(a) \alpha^{3n}(a) \alpha^{4n}(a))$.

Expanding out $a$, we obtain a linear combination of terms of the form

$\tau(1_A g_0 1_A \, 1_{S^n(A)} (T^n g_1) 1_{S^n(A)} \, 1_{S^{2n}(A)} (T^{2n} g_2) 1_{S^{2n}(A)} \, 1_{S^{3n}(A)} (T^{3n} g_3) 1_{S^{3n}(A)} \, 1_{S^{4n}(A)} (T^{4n} g_4) 1_{S^{4n}(A)})$
where

$g_0, g_1, g_2, g_3, g_4 \in \{\mathrm{id}, e_0, e_1, e_2, e_3, e_4, e_0^{-1}, e_1^{-1}, e_2^{-1}, e_3^{-1}, e_4^{-1}\}$.

This trace vanishes unless

(23) $g_0 \, T^n g_1 \, T^{2n} g_2 \, T^{3n} g_3 \, T^{4n} g_4 = \mathrm{id}$.

By Proposition B.9 we conclude that $g_0, g_1, g_2, g_3, g_4$ are either all equal to the
identity, or are a permutation of $e_0, e_1, e_2, e_3, e_4$, or are a permutation of
$e_0^{-1}, e_1^{-1}, e_2^{-1}, e_3^{-1}, e_4^{-1}$. In the latter two cases, the contribution to (22) is either zero
or negative (being the negative of the trace of the product of several non-negative
elements in a commutative von Neumann algebra). Here we are using the fact that 5 is
odd. Discarding all of these contributions except the one where $g_i = e_i$ for all $i$ (which
has a non-trivial contribution thanks to Proposition B.9), we conclude that (22) is at most

$10^5 \tau(1_A 1_{S^n(A)} 1_{S^{2n}(A)} 1_{S^{3n}(A)} 1_{S^{4n}(A)})$
$- \tau(1_A e_0 1_A \, 1_{S^n(A)} e_1 1_{S^n(A)} \, 1_{S^{2n}(A)} e_2 1_{S^{2n}(A)} \, 1_{S^{3n}(A)} e_3 1_{S^{3n}(A)} \, 1_{S^{4n}(A)} e_4 1_{S^{4n}(A)})$.

By Theorem 2.8 the first expression is at most $10^5 \mu(A)^{100}$. Now consider the
second expression. By Proposition B.9 we see that the partial products $e_0 e_1 \cdots e_i$
for $i = 0, 1, 2, 3$ are distinct. Using (21), we conclude that the trace here can be
computed as

$\mu(S^{4n}(A) \cap A) \, \mu(A \cap S^n(A)) \, \mu(S^n(A) \cap S^{2n}(A)) \, \mu(S^{2n}(A) \cap S^{3n}(A)) \, \mu(S^{3n}(A) \cap S^{4n}(A))$

which by (20) is equal to $\mu(A)^{10}$. Thus the expression (22) is at most $2^{15} \mu(A)^{100} -
\mu(A)^{10}$, which is negative if the upper bound $\delta$ for $\mu(A)$ is chosen to be sufficiently
small.

This concludes the proof of Theorem 2.7. □

Remark 2.6. Given that the counterexample in Theorem 2.8 can be extended to
any $k \ge 5$, it seems reasonable to expect that Theorem 1.13 can be extended to all
$k \ge 5$ (not just the odd $k$), though we have not pursued this issue. On the other
hand, the analogue of Theorem 2.8 fails for $k = 4$, as was shown in [4]. Because of
this, the $k = 4$ case of Theorem 1.13 remains open; the construction given here does
not work, but it is possible that some other construction would suffice instead. ◁



3. Inclusions of finite von Neumann dynamical systems 



In this section we quickly recall some fairly well-known constructions relating to von
Neumann dynamical systems and their basic properties, culminating in a treatment
of Popa's noncommutative version of the Furstenberg-Zimmer dichotomy from [52].
This material will be needed to establish the structure theorem (Theorem 1.9).

Let $(\mathcal{M}, \tau)$ be a finite von Neumann algebra. As noted in the introduction, we can
embed $\mathcal{M}$ into a Hilbert space $L^2(\tau)$. In order to distinguish the algebra structure
from the Hilbert space structure, we shall refer in this section to the embedded
copy of an element $a \in \mathcal{M}$ of the algebra in $L^2(\tau)$ as $\hat{a}$ rather than $a$, thus for
instance $\hat{\mathcal{M}} = \{\hat{a} : a \in \mathcal{M}\}$ is a dense subspace of $L^2(\tau)$.

Footnote: It is tempting to ignore these distinctions and identify $\hat{\mathcal{M}}$ with $\mathcal{M}$. While
this is normally quite a harmless identification, we will take some care here because
we will be studying the bimodule action of $\mathcal{M}$ on $L^2(\tau)$, and keeping track of this
action can become notationally confusing if the algebra elements are identified with
the vectors that they act on.

Clearly, $L^2(\tau)$ has the structure of an $\mathcal{M}$-bimodule, formed by extending the regular
bimodule structure on $\hat{\mathcal{M}}$ by density; the left-representation is, of course, the
classical Gel'fand-Naimark-Segal representation associated to $\tau$. When it is necessary
to denote the copy of $\mathcal{M}$ in $B(L^2(\tau))$ consisting of the members of $\mathcal{M}$ acting
by multiplication on the left (respectively, right), we will denote this algebra by
$\mathcal{M}_{\mathrm{left}}$ (respectively, $\mathcal{M}_{\mathrm{right}}$).

The space $L^2(\tau)$ contains a distinguished vector $\hat{1}$ (the representative of the
multiplicative identity 1 in $\mathcal{M}$) with the property that $a \hat{1} = \hat{1} a = \hat{a}$ for all $a \in \mathcal{M}$.
This vector will play a prominent role in the rest of this section.

Now let $(\mathcal{N}, \tau|_{\mathcal{N}})$ be a von Neumann subalgebra of $(\mathcal{M}, \tau)$ (with the inherited
trace). Then we can canonically identify $L^2(\tau|_{\mathcal{N}})$ with the closed subspace

$\{\hat{b} : b \in \mathcal{N}\} = \mathcal{N} \hat{1} = \hat{1} \mathcal{N}$

of $L^2(\tau)$ in the obvious manner.

We will make use of certain well-known properties of these constructs, which we
merely recall here. A clear account of all of them can be found in [23, Chapters
1, 3].

First, it is important that there is a simple necessary and sufficient condition for a
vector $\xi \in L^2(\tau)$ to lie in the dense subspace $\hat{\mathcal{M}}$: this is so if and only if the linear
operator

$\hat{\mathcal{M}} \to L^2(\tau) : \hat{x} \mapsto x \xi$

is bounded for the norm $\| \cdot \|_{L^2(\tau)}$, and so extends by continuity to a bounded
operator $L^2(\tau) \to L^2(\tau)$. The necessity of this conclusion is clear, and its sufficiency
requires just a little argument using the fact that for a finite von Neumann algebra
$(\mathcal{M}, \tau)$ we have $\mathcal{M}_{\mathrm{right}} = \mathcal{M}_{\mathrm{right}}''$ and $\mathcal{M}_{\mathrm{left}} = \mathcal{M}_{\mathrm{left}}''$; see [53, Theorem 1.2.4].

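A simple application of this condition now shows that the orthogonal projection
$e_{\mathcal{N}} : L^2(\tau) \to \mathcal{N} \hat{1}$ maps the dense subspace $\hat{\mathcal{M}}$ into $\hat{\mathcal{N}}$, and so defines also a linear
operator $E_{\mathcal{N}} : \mathcal{M} \to \mathcal{N}$. Indeed, for $a \in \mathcal{M}$ we need only to show that the map

$\hat{\mathcal{M}} \to L^2(\tau) : \hat{x} \mapsto x e_{\mathcal{N}}(\hat{a})$

is bounded for the norm $\| \cdot \|_{L^2(\tau)}$. Since $\mathcal{N}$ is also a von Neumann algebra and
$e_{\mathcal{N}}(\hat{a}) \in \mathcal{N} \hat{1} = L^2(\tau|_{\mathcal{N}})$, it actually suffices to check this for $x \in \mathcal{N}$. However, since
$\mathcal{N} \hat{1}$ is an $(\mathcal{N}, \mathcal{N})$-sub-bimodule, left multiplication by $x$ commutes with $e_{\mathcal{N}}$, and so
we have

$\|x e_{\mathcal{N}}(\hat{a})\|_{L^2(\tau)} = \|e_{\mathcal{N}}(x \hat{a})\|_{L^2(\tau)} \le \|x \hat{a}\|_{L^2(\tau)} \le \|a\| \, \|\hat{x}\|_{L^2(\tau)}$,

as required.

The linear operator $E_{\mathcal{N}}$ is referred to as the conditional expectation of $\mathcal{M}$ onto $\mathcal{N}$
associated to $\tau$, and it has the following readily-verified properties: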

Lemma 3.1 (Properties of conditional expectation). For all $a \in \mathcal{M}$, the operator
$E_{\mathcal{N}}$ satisfies:

• (Idempotence) $E_{\mathcal{N}}(E_{\mathcal{N}}(a)) = E_{\mathcal{N}}(a)$;

• (Contractivity) $\|E_{\mathcal{N}}(a)\| \le \|a\|$;

• (Trace-preservation) $\tau|_{\mathcal{N}}(E_{\mathcal{N}}(a)) = \tau(a)$;

• (Positivity) $E_{\mathcal{N}}(a^* a) \ge 0$ (as a member of $\mathcal{N}$); and

• (Relation with $e_{\mathcal{N}}$) For all $\xi \in L^2(\tau)$, one has

$e_{\mathcal{N}}(a(e_{\mathcal{N}}(\xi))) = E_{\mathcal{N}}(a)(e_{\mathcal{N}}(\xi)) = e_{\mathcal{N}}(E_{\mathcal{N}}(a)(\xi))$.
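A finite-dimensional toy case may help fix ideas. In the Python sketch below, $\mathcal{M}$ is taken to be the $3 \times 3$ complex matrices with the normalised trace and $\mathcal{N}$ the diagonal subalgebra; both are illustrative assumptions, not objects from the text. In this setting the conditional expectation is simply "keep the diagonal", and the idempotence, trace-preservation and positivity properties can be checked directly.

```python
def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    d = len(P)
    return [[sum(P[i][t] * Q[t][j] for t in range(d)) for j in range(d)]
            for i in range(d)]


def adjoint(P):
    """Conjugate transpose."""
    d = len(P)
    return [[complex(P[j][i]).conjugate() for j in range(d)] for i in range(d)]


def tau(P):
    """Normalised trace."""
    d = len(P)
    return sum(P[i][i] for i in range(d)) / d


def expectation(P):
    """Conditional expectation onto the diagonal subalgebra: zero the off-diagonal entries."""
    d = len(P)
    return [[P[i][j] if i == j else 0 for j in range(d)] for i in range(d)]


# Hypothetical element of M chosen for illustration.
a = [[1 + 2j, 3, -1j],
     [0, 2 - 1j, 4],
     [5j, -2, 1]]

idempotent = expectation(expectation(a)) == expectation(a)
trace_gap = abs(tau(expectation(a)) - tau(a))
positive_part = expectation(matmul(adjoint(a), a))
diagonal = [positive_part[i][i] for i in range(3)]
print(idempotent, trace_gap, diagonal)
```

The diagonal entries of $E_{\mathcal{N}}(a^* a)$ are sums of squared moduli of the columns of $a$, which makes the positivity property visible in this example.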

Example 3.1. If $\mathcal{M} = L^\infty(X, \mathcal{X}, \mu)$ for some probability measure $\mu$ with the usual
trace, and $(Y, \mathcal{Y}, \nu)$ is a factor space of $(X, \mathcal{X}, \mu)$ with a measurable factor map
$\pi : X \to Y$ that pushes $\mu$ forward to $\nu$, then $L^\infty(Y, \mathcal{Y}, \nu)$ can be identified with
a subalgebra of $\mathcal{M}$, and the conditional expectation map becomes its classical
counterpart from probability theory. ◁

Together with $\mathcal{M}$, the orthogonal projection $e_{\mathcal{N}}$ now generates in $B(L^2(\tau))$ a larger
von Neumann algebra $\langle \mathcal{M}, e_{\mathcal{N}} \rangle \supset \mathcal{M}$. In general $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$ is no longer a finite von
Neumann algebra, but it does contain the dense *-subalgebra $\mathcal{A} := \mathrm{lin}(\mathcal{M} \cup \{x e_{\mathcal{N}} y :
x, y \in \mathcal{M}\})$ on which we define the lifted trace $\bar{\tau} : \mathcal{A} \to \mathbb{C}$ by specifying $\bar{\tau}(x e_{\mathcal{N}} y) =
\tau(xy)$. By choosing an orthonormal basis for $L^2(\tau)$ relative to the right action of
$\mathcal{N}$, and consequently realizing $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$ as an amplification of $\mathcal{N}$, this linear map is
seen to be non-negative and faithful, and hence defines a semifinite normal faithful
$[0, +\infty]$-valued trace (which we still denote by $\bar{\tau}$) on the cone $\langle \mathcal{M}, e_{\mathcal{N}} \rangle_+$ of
non-negative (and self-adjoint) elements of $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$. This witnesses that the algebra
$\langle \mathcal{M}, e_{\mathcal{N}} \rangle$ is semifinite (that is, any positive element of it may be approximated
from below by finite-$\bar{\tau}$ positive elements). We will not spell out these standard
manipulations here (see, for instance, [32, Section 1.5]), but we will invoke a notion
of orthonormal basis for right-$\mathcal{N}$-submodules of $L^2(\tau)$ shortly.

Remark 3.2. In case $\mathcal{N} \subset \mathcal{M}$ is a finite-index inclusion of finite $\mathrm{II}_1$ factors, then we
find that $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$ is also a finite $\mathrm{II}_1$ factor. Writing $\mathcal{M}_1$ for this factor, it follows
that the above construction may be repeated with the inclusion $\mathcal{M} \subset \mathcal{M}_1$ in place
of $\mathcal{N} \subset \mathcal{M}$, and indeed that it may be iterated to form an infinite tower of $\mathrm{II}_1$
factors

$\mathcal{N} \subset \mathcal{M} \subset \mathcal{M}_1 \subset \mathcal{M}_2 \subset \cdots$.

This is Jones' basic construction; it underlies his famous work [25] on the possible
values of the index $[\mathcal{N} : \mathcal{M}]$, and also several more recent developments. Once again
we refer the reader to [24] for a thorough account of its importance, and numerous
further references. However, since the construction of this whole infinite tower is
special to the case of $\mathrm{II}_1$ factors, we will not focus on it further here. ◁

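It is easy to check that the right action of any $n \in \mathcal{N}$ commutes with any $x e_{\mathcal{N}} y$, and
hence with any member of $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$, and in fact it can be shown that $\langle \mathcal{M}, e_{\mathcal{N}} \rangle' =
\mathcal{N}_{\mathrm{right}}$ and hence that $\mathcal{N}_{\mathrm{right}}' = \langle \mathcal{M}, e_{\mathcal{N}} \rangle'' = \langle \mathcal{M}, e_{\mathcal{N}} \rangle$: firstly, if $A \in B(L^2(\tau))$
commutes with every $b \in \mathcal{M}_{\mathrm{left}}$ then it must be the right-action of some $a \in \mathcal{M}$,
and now if also $e_{\mathcal{N}}(\hat{1} a) = \hat{1} a$ then we must in fact have $a \in \mathcal{N}$ (see Proposition
3.1.2 in [24]). Let us record the following immediate but important consequence of
this for our later work:

Lemma 3.2. If $V \le L^2(\tau)$ is a closed right-$\mathcal{N}$-submodule, then the orthogonal
projection $P_V : L^2(\tau) \to V$ is a member of $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$. □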
Using $\bar{\tau}$ we can also define an alternative completion of $\mathcal{A}$ for each
$p \in [1, \infty)$ by setting $\|A\|_{p, \bar{\tau}} := \sqrt[p]{\bar{\tau}((A^* A)^{p/2})}$ for $A \in \mathcal{A}$ (where as usual the power
$(A^* A)^{p/2}$ is defined using spectral theory for the selfadjoint operator $A^* A$, and the
non-negativity of $\bar{\tau}$ is used to show that $\bar{\tau}((A^* A)^{p/2})$ is finite even when $p/2$ is not
an integer). We denote this completion by $L^p(\bar{\tau})$; it is a Hilbert space when $p = 2$.
In general elements of $L^p(\bar{\tau})$ do not correspond to elements of $\langle \mathcal{M}, e_{\mathcal{N}} \rangle$, but they
do give possibly unbounded but closable operators that are weakly approximable
by members of this algebra, which are therefore affiliated to $\mathcal{N}_{\mathrm{right}}'$. If $A \in L^p(\bar{\tau})$ is
such an operator that is self-adjoint, then it admits a spectral decomposition

A= ( sP(ds) 
Jr 

for some spectral measure P on K taking values in the projections of (A4,e_\f) n 
L (t), of possibly unbounded support in K, but for which 

mn^= / M p fP(d S )<co. 

Jr 

If V is as in Lemma |3.2| then we may write that Py has finite lifted trace if it 
corresponds to a member of (M,e^/) n ^ 1 (t). 

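Now let us introduce some dynamics. Suppose that $\alpha$ is a shift on $\mathcal{M}$ which
restricts to a shift on $\mathcal{N}$. Then, as mentioned in the introduction, $\alpha$ induces a
unitary operator acting on $L^2(\tau)$, which we shall distinguish from $\alpha$ by writing it
as $U_\alpha$; thus for instance (identifying $a \in \mathcal{M}$ with the vector $a1 \in L^2(\tau)$)

\[ U_\alpha a = U_\alpha(a1) = \alpha(a)1 = \alpha(a) \]

for all $a \in \mathcal{M}$. It is clear that $\mathcal{N}1$ is an invariant subspace for $U_\alpha$, so that $U_\alpha$
commutes with $e_{\mathcal{N}}$. Also, conjugation by $U_\alpha$ agrees with the action $\alpha$ on $\mathcal{M}$, thus

\[ U_\alpha a U_\alpha^{-1} \xi = \alpha(a)\xi \]

for all $a \in \mathcal{M}$ and $\xi \in L^2(\tau)$. Thus, conjugation by $U_\alpha$ extends the action of $\alpha$ to
$\langle\mathcal{M}, e_{\mathcal{N}}\rangle$.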
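As a quick check of the conjugation identity, one can evaluate both sides on the dense subspace $\mathcal{M}1$. The sketch below assumes only that $U_\alpha(b1) = \alpha(b)1$ for $b \in \mathcal{M}$; the test vector $b1$ is introduced here purely for illustration.

```latex
% Check of U_\alpha a U_\alpha^{-1} = \alpha(a) on vectors b1 with b in M.
\begin{align*}
U_\alpha\, a\, U_\alpha^{-1}(b1)
  &= U_\alpha\big(a\,\alpha^{-1}(b)\,1\big)
     && \text{since } U_\alpha^{-1}(b1) = \alpha^{-1}(b)1 \\
  &= \alpha\big(a\,\alpha^{-1}(b)\big)1
     && \text{definition of } U_\alpha \\
  &= \alpha(a)\, b\, 1
     && \alpha \text{ is multiplicative.}
\end{align*}
```

Both sides are bounded operators on $L^2(\tau)$ that agree on a dense subspace, so they agree everywhere.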

The following special class of one-sided submodules of $L^2(\tau)$ appears here almost
exactly as in the commutative setting.

Definition 3.3 (Finite-rank modules). A left- (respectively, right-) $\mathcal{N}$-submodule
$V$ of $L^2(\tau)$ has finite rank if there are some $\xi_1, \xi_2, \ldots, \xi_r \in V$ such that
$V = \overline{\sum_{i=1}^r \mathcal{N}\xi_i}$ (respectively, $V = \overline{\sum_{i=1}^r \xi_i\mathcal{N}}$), and the numerical value of its rank is the
least $r \ge 1$ for which this is possible.

Proposition 3.4 (Relativized Gram-Schmidt procedure). If $V \le L^2(\tau)$ is a $U_\alpha$-invariant
right-$\mathcal{N}$-submodule of finite rank $r$ then there are $\xi_1, \xi_2, \ldots, \xi_r \in L^2(\tau)$ such that

• the subspaces $\xi_i\mathcal{N} \le L^2(\tau)$ are pairwise orthogonal; and

Proof. This uses a relativized Gram-Schmidt argument much as in the commutative
setting (see e.g. [TBI Lemma 9.4]). We proceed by induction on $r$. If $V$ has rank 1
then the result is immediate from the definition, so let us suppose that it has rank
$r + 1$ for some $r \ge 1$. Then given a representation

\[ V = \overline{\sum_{i=1}^{r+1} \xi_i^\circ \mathcal{N}} \]

we know that any member of $V$ may be approximated in $\|\cdot\|_{L^2(\tau)}$ by expressions
of the form $\xi_1^\circ n_1 + \cdots + \xi_{r+1}^\circ n_{r+1}$ with $n_1, n_2, \ldots, n_{r+1} \in \mathcal{N}$. This, in turn, may be
re-written as

\[ (\xi_1^\perp n_1 + \cdots + \xi_r^\perp n_r) + ((\xi_1^\circ - \xi_1^\perp) n_1 + \cdots + (\xi_r^\circ - \xi_r^\perp) n_r) + \xi_{r+1}^\circ n_{r+1} \]

where for each $i \le r$ we have decomposed $\xi_i^\circ$ into its component $\xi_i^\perp$ orthogonal to
$\xi_{r+1}^\circ\mathcal{N}$ and the remainder $\xi_i^\circ - \xi_i^\perp \in \overline{\xi_{r+1}^\circ\mathcal{N}}$. Since $\overline{\xi_{r+1}^\circ\mathcal{N}}$ is a right-$\mathcal{N}$-submodule,
it follows that the second and third inner sums in the above decomposition both
lie in $\overline{\xi_{r+1}^\circ\mathcal{N}}$, and now since $(\xi_{r+1}^\circ\mathcal{N})^\perp$ is also a right-$\mathcal{N}$-submodule, we have in fact
shown that

\[ V = V_1 + \overline{\xi_{r+1}^\circ\mathcal{N}} \]

where $V_1 := \overline{\sum_{i=1}^r \xi_i^\perp\mathcal{N}}$ is a rank-$r$ right-$\mathcal{N}$-submodule that is orthogonal to $\xi_{r+1}^\circ\mathcal{N}$.
Applying the inductive hypothesis to $V_1$ now completes the proof. □

The following definition is also drawn from the commutative world. This notion
has previously been extended to the setting of non-commutative algebras by Popa
in [32], who discusses several other aspects and equivalent conditions in that paper.
(See also [31], [TT], [B] for an analysis of the absolute analogue of weak mixing, in
which the subalgebra $\mathcal{N}$ is the trivial algebra $\mathbb{C}1$.)

Definition 3.5 (Relative weak mixing). If $(\mathcal{M}, \tau, \alpha)$ is a von Neumann dynamical
system and $\mathcal{N} \subset \mathcal{M}$ is an $\alpha$-invariant von Neumann subalgebra, then $\alpha$ is weakly
mixing relative to $\mathcal{N}$ if for any $a \in \mathcal{M} \cap \mathcal{N}^\perp$ we have

\[ \frac{1}{N} \sum_{n=1}^N \|E_{\mathcal{N}}(a^* \alpha^n(a))\|_{L^2(\tau)}^2 \to 0 \quad \text{as } N \to \infty. \]
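To connect this with the absolute notion mentioned above, it may help to specialise to the trivial subalgebra. The following is a sketch under two assumptions: that $\mathcal{N} = \mathbb{C}1$, so that $E_{\mathcal{N}}(x) = \tau(x)1$, and that the defining averages are taken with the squared $L^2(\tau)$ norm.

```latex
% Specialisation of relative weak mixing to N = C1 (assumed: E_N(x) = tau(x) 1).
\[
\|E_{\mathbb{C}1}(a^*\alpha^n(a))\|_{L^2(\tau)}^2
  = \|\tau(a^*\alpha^n(a))\,1\|_{L^2(\tau)}^2
  = |\tau(a^*\alpha^n(a))|^2 ,
\]
% so the condition becomes, for every a in M with tau(a) = 0,
\[
\frac{1}{N}\sum_{n=1}^N |\tau(a^*\alpha^n(a))|^2 \;\to\; 0
  \qquad (N \to \infty).
\]
```

Here the condition $a \perp \mathbb{C}1$ is simply $\tau(a) = 0$, which is the familiar shape of weak mixing for mean-zero observables.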

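The basic inverse theorem that we need, extending the idea of Furstenberg and
Zimmer to the non-commutative context, is contained in the following proposition,
which essentially re-proves part of [32, Lemma 2.10]:

Proposition 3.6 (Lack of weak mixing implies finite trace sub-module). If $\alpha$ is
not weakly mixing relative to $\mathcal{N}$ then there is a $U_\alpha$-invariant right-$\mathcal{N}$-submodule
$V \le L^2(\tau) \ominus \mathcal{N}1$ such that $P_V$ has finite lifted trace.

Proof. Suppose that $a \in \mathcal{M} \cap \mathcal{N}^\perp$ is such that

\[ \frac{1}{N} \sum_{n=1}^N \|E_{\mathcal{N}}(a^* \alpha^n(a))\|_{L^2(\tau)}^2 \not\to 0. \]

Define $b := a e_{\mathcal{N}} a^* \in \langle\mathcal{M}, e_{\mathcal{N}}\rangle$, and now observe (using the cyclic permutability of
$\bar\tau$ and the identity $e_{\mathcal{N}} m e_{\mathcal{N}} = E_{\mathcal{N}}(m) e_{\mathcal{N}}$) that for any $n \in \mathbb{N}$ we have

\[ \bar\tau(b (U_\alpha^n b U_\alpha^{-n})) = \bar\tau(a e_{\mathcal{N}} a^* U_\alpha^n (a e_{\mathcal{N}} a^*) U_\alpha^{-n}) = \bar\tau(a e_{\mathcal{N}} a^* \alpha^n(a) e_{\mathcal{N}} \alpha^n(a)^*) \]
\[ = \bar\tau(E_{\mathcal{N}}(a^* \alpha^n(a)) e_{\mathcal{N}} \alpha^n(a)^* a) = \|E_{\mathcal{N}}(a^* \alpha^n(a))\|_{L^2(\tau)}^2. \]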
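The chain of equalities above compresses several standard steps. The following expansion is a sketch that assumes the usual properties of the lifted trace and the conditional expectation, namely $\bar\tau(x e_{\mathcal{N}} y) = \tau(xy)$ for $x, y \in \mathcal{M}$, $\tau(xy) = \tau(x E_{\mathcal{N}}(y))$ for $x \in \mathcal{N}$, and $E_{\mathcal{N}}(y)^* = E_{\mathcal{N}}(y^*)$; the shorthand $x_n$ is introduced only for this note.

```latex
% Write x_n := E_N(a^* \alpha^n(a)) in N.  Note U_\alpha commutes with e_N, so
% U_\alpha^n (a e_N a^*) U_\alpha^{-n} = \alpha^n(a) e_N \alpha^n(a)^*.
\begin{align*}
\bar\tau\big(a e_{\mathcal N} a^*\,\alpha^n(a) e_{\mathcal N}\alpha^n(a)^*\big)
 &= \bar\tau\big(e_{\mathcal N}\, a^*\alpha^n(a)\, e_{\mathcal N}\,\alpha^n(a)^* a\big)
   && \text{cyclicity of } \bar\tau \\
 &= \bar\tau\big(x_n\, e_{\mathcal N}\,\alpha^n(a)^* a\big)
   && e_{\mathcal N} m e_{\mathcal N} = E_{\mathcal N}(m) e_{\mathcal N} \\
 &= \tau\big(x_n\,\alpha^n(a)^* a\big)
   && \bar\tau(x e_{\mathcal N} y) = \tau(xy) \\
 &= \tau\big(x_n\, E_{\mathcal N}(\alpha^n(a)^* a)\big)
   && x_n \in \mathcal N \\
 &= \tau(x_n x_n^*) = \|x_n\|_{L^2(\tau)}^2
   && \tau \text{ is a trace.}
\end{align*}
```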

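Averaging in $n$ it follows that

\[ \bar\tau(b\, b_1) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \|E_{\mathcal{N}}(a^* \alpha^n(a))\|_{L^2(\tau)}^2 \ne 0, \]

where $b_1$ is the limit of the ergodic averages $\frac{1}{N} \sum_{n=1}^N \alpha^n(b)$ in the Hilbertian
completion $L^2(\bar\tau)$, which is therefore invariant under the further extension of the unitary
operator $U_\alpha$ to this Hilbert space.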

This new element $b_1$ need not, in general, correspond to a member of $\langle\mathcal{M}, e_{\mathcal{N}}\rangle$ (it is
easily seen to be so in the commutative setting, but for special reasons); however,
as a $\|\cdot\|_{L^2(\bar\tau)}$-limit of members of $\langle\mathcal{M}, e_{\mathcal{N}}\rangle = \mathcal{N}_{\mathrm{right}}'$ it can always be identified with
a closed operator on $L^2(\tau)$ that is affiliated with the right action of the algebra $\mathcal{N}$,
and as such it admits a spectral decomposition

\[ b_1 = \int_0^\infty s\, P(ds) \]

for some resolution of the identity $P$ on $[0, \infty)$ whose contributing spectral projections
lie in $\langle\mathcal{M}, e_{\mathcal{N}}\rangle$, and for which

\[ \int_0^\infty s^2\, \bar\tau(P(ds)) = \|b_1\|_{L^2(\bar\tau)}^2 < \infty. \]

Hence $\bar\tau P(I) < \infty$ for any Borel subset $I \subseteq (0, \infty)$ bounded away from 0. Now
choosing any such subset $I$ for which $P(I) \ne 0$ gives an orthogonal projection
$P(I) \in \langle\mathcal{M}, e_{\mathcal{N}}\rangle$ of finite lifted trace that is $U_\alpha$-invariant, commutes with the right-
$\mathcal{N}$-action because it lies in $\langle\mathcal{M}, e_{\mathcal{N}}\rangle$, and moreover has image orthogonal to $1\mathcal{N}$
because we initially chose $b$ to lie in the orthogonal complement of this subspace. □

Remark 3.3. The above implication can in fact be reversed, and these conditions
shown to be equivalent to a number of others; see [32, Lemma 2.10] for a more
complete picture. ◁

In the next section we will push the above results a little further under the additional
assumption that the subalgebra is central, leading to the proof of Theorem 1.9.



4. The case of asymptotically abelian systems 



We now specialize to the case of an asymptotically abelian system, with the crucial
additional assumption that the subalgebra $\mathcal{N}$ is central.

Lemma 4.1. Suppose that $(\mathcal{M}, \tau, \alpha)$ is a von Neumann dynamical system, $\mathcal{N} \subset \mathcal{M}$
is an $\alpha$-invariant central von Neumann subalgebra and $V \le L^2(\tau)$ is a $U_\alpha$-invariant
right-$\mathcal{N}$-submodule of finite lifted trace. Then for any $\varepsilon > 0$ there is a further $U_\alpha$-
invariant right-$\mathcal{N}$-submodule $V_1 \le V$ such that

• $\bar\tau(P_V - P_{V_1}) < \varepsilon$;

• $V_1$ has finite rank, say $r \ge 1$;

• there are an orthogonal right-$\mathcal{N}$-basis $\xi_1, \xi_2, \ldots, \xi_r$ and a unitary matrix
of operators $U = (u_{ji})_{1 \le i,j \le r} \in \mathcal{U}_{r \times r}(\mathcal{N})$ such that

\[ U_\alpha(\xi_i) = \sum_{j=1}^r \xi_j u_{ji} \qquad \forall i = 1, 2, \ldots, r. \]

We refer to $U$ as the cocycle representing the action of $U_\alpha$ on the basis elements.

Proof. We will prove this invoking the picture of the representation of $\mathcal{N}$ on $L^2(\tau)$
as a direct integral coming from spectral theory. By the classical theory of direct
integrals (see, for instance, [13, Chapter 14]), we can select

• a standard Borel probability space $(Y, \nu)$;

• a Borel partition $Y = \bigcup_{n \ge 1} Y_n \cup Y_\infty$;

• a collection of Hilbert spaces $\mathfrak{H}_n$ for $n \in \{1, 2, \ldots, \infty\}$ with $\dim(\mathfrak{H}_n) = n$;
and

• a unitary equivalence

\[ \Phi : L^2(\tau) \to \mathfrak{H} := \int_Y^\oplus \mathfrak{H}_y\, \nu(dy), \]

where we define $\mathfrak{H}_y$ to be $\mathfrak{H}_n$ when $y \in Y_n$,

such that $\mathcal{N}$ (acting on the right or left, since these agree for a central subalgebra
of $\mathcal{M}$) is identified with the algebra of functions $L^\infty(\nu)$ acting by pointwise
multiplication. Explicitly, if we denote elements of $\mathfrak{H}$ as measurable sections
$v : Y \to \bigcup_{y \in Y} \mathfrak{H}_y$, then $f \in L^\infty(\nu)$ acts on $\mathfrak{H}$ by

\[ M_f(v)(y) := f(y) v(y). \]

Moreover, in order to accommodate $\Phi(\mathcal{N}1)$ we select a measurable section $v_0 \in \mathfrak{H}$
with $\|v_0(y)\|_{\mathfrak{H}_y} \equiv 1$, and now $\mathcal{N}1$ is identified with

\[ \{ y \mapsto f(y) v_0(y) : f \in L^\infty(\nu) \}, \]

so that the orthogonal projection $\Phi e_{\mathcal{N}} \Phi^{-1}$ acts by

\[ \Phi e_{\mathcal{N}} \Phi^{-1}(v)(y) := \langle v(y), v_0(y) \rangle_{\mathfrak{H}_y} \cdot v_0(y). \]

The larger algebra $\mathcal{M}_{\mathrm{right}}$ is identified under $\Phi$ with a direct integral

\[ \int_Y^\oplus \mathcal{M}_y\, \nu(dy), \]

so that elements of $\Phi(\mathcal{M})$ are expressed as measurable sections $T : Y \to \bigcup_{y \in Y} B(\mathfrak{H}_y)$
acting by

\[ Tv(y) := T(y)(v(y)) \]

and such that $T(y) \in \mathcal{M}_y$ $\nu$-almost surely, where $(\mathcal{M}_y)_{y \in Y}$ is a measurable field of
finite von Neumann subalgebras of $B(\mathfrak{H}_y)$ for each of which the state

\[ \mathcal{M}_y \to \mathbb{C} : T \mapsto \langle v_0(y), T(v_0(y)) \rangle_{\mathfrak{H}_y} \]

is a faithful finite trace; overall we have

\[ \tau(a) = \langle 1, a1 \rangle = \int_Y \langle v_0(y), \Phi(a)(y)(v_0(y)) \rangle_{\mathfrak{H}_y}\, \nu(dy) \]

for $a \in \mathcal{M}$, and so in particular if $n \in \mathcal{N}$ then $\Phi(n) \in L^\infty(\nu)$ and $\tau(n) = \int_Y \Phi(n)\, d\nu$.
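Given these data, for $a, b \in \mathcal{M}$ we can compute that

\[ \Phi(a e_{\mathcal{N}} b)\Phi^{-1} v(y) = \langle \Phi(b)(y)(v(y)), v_0(y) \rangle \cdot \Phi(a)(y)(v_0(y)) \]

and

\[ \bar\tau(a e_{\mathcal{N}} b) = \tau(ab) = \int_Y \langle v_0(y), \Phi(ab)(y)(v_0(y)) \rangle_{\mathfrak{H}_y}\, \nu(dy) \]
\[ = \int_Y \langle \Phi(a^*)(y)(v_0(y)), \Phi(b)(y)(v_0(y)) \rangle_{\mathfrak{H}_y}\, \nu(dy) = \int_Y \mathrm{tr}_{\mathfrak{H}_y}\big( (\Phi(a e_{\mathcal{N}} b)\Phi^{-1})(y) \big)\, \nu(dy). \]

In this representation an $\mathcal{N}$-submodule $V \le L^2(\tau)$ corresponds to a subspace
$\Phi(V) \le \mathfrak{H}$ of the form $\int_Y^\oplus V_y\, \nu(dy)$ for some measurable subfield of Hilbert spaces
$V_y \le \mathfrak{H}_y$, and the above calculation now shows that

\[ \bar\tau(P_V) = \int_Y \dim(V_y)\, \nu(dy), \]

so $P_V$ has finite lifted trace if and only if the function $y \mapsto \dim(V_y)$ is $\nu$-integrable.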

We can enhance this picture further by noting that since $\alpha$ preserves $\mathcal{N}$ it must
correspond to some $\nu$-preserving transformation $S \curvearrowright Y$, and that since it also
preserves $\mathcal{M}$ and extends to a unitary operator on $L^2(\tau)$ it must also preserve each of
the cells $Y_n$. Similarly, since $V$ is $U_\alpha$-invariant, the transformation $S$ must preserve
the function $y \mapsto \dim(V_y)$. It follows that the unitary operator $\Phi U_\alpha \Phi^{-1}$ on $L^2(\tau)$ is
actually given by a measurable section of unitary operators $\Psi : Y \to \bigcup_{y \in Y} \mathcal{U}(\mathfrak{H}_y)$
such that

\[ \Phi U_\alpha \Phi^{-1} v(y) = \Psi(y)(v(S^{-1} y)). \]

Now, since $y \mapsto \dim(V_y)$ is $\nu$-integrable, for sufficiently large $r \ge 1$ we know that

\[ \int_{\{y \in Y :\ \dim(V_y) > r\}} \dim(V_y)\, \nu(dy) < \varepsilon. \]
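The tail estimate used here is the usual consequence of integrability. A minimal sketch, writing $d(y)$ for the fibre dimension of $V$ at $y$ (a shorthand introduced only for this note) and assuming $d \in L^1(\nu)$:

```latex
% d(y) := dimension of the fibre V_y, assumed nu-integrable (hence finite nu-a.e.).
\[
\int_{\{d > r\}} d \, d\nu
  \;=\; \int_Y d \cdot 1_{\{d > r\}} \, d\nu
  \;\longrightarrow\; 0 \qquad (r \to \infty),
\]
% by dominated convergence: d 1_{\{d>r\}} \le d, and d 1_{\{d>r\}} \to 0 pointwise
% wherever d is finite.  So some finite r makes the tail smaller than epsilon.
```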



Define

\[ W := \int^\oplus_{\{y \in Y :\ \dim(V_y) \le r\}} V_y\, \nu(dy) \oplus \int^\oplus_{\{y \in Y :\ \dim(V_y) > r\}} \{0\}\, \nu(dy) \]

and $V_1 := \Phi^{-1}(W)$. Clearly $V_1$ is still a right-$\mathcal{N}$-submodule that is $U_\alpha$-invariant,
and it clearly also has rank at most $r$ (since it suffices to prove this for $W$, for which
it follows by a relativized Gram-Schmidt construction of a fibrewise-orthonormal
basis exactly as in the setting of commutative ergodic theory; see for instance [T5I
Lemma 9.4]). Also, we have

\[ \bar\tau(P_V - P_{V_1}) = \int_{\{y \in Y :\ \dim(V_y) > r\}} \dim(V_y)\, \nu(dy) < \varepsilon. \]

Finally, the selection of unitaries $\Psi$ must preserve the field of subspaces $V_y$ above
the $S$-invariant set $\{y \in Y : \dim(V_y) = s\}$ for each $s \le r$. Choosing an abstract
$d$-dimensional Euclidean space $W_d$ for each $d \le r$ and adjusting each fibre of $W$
by a unitary in order to identify each $V_y$ for which $\dim(V_y) \le r$ with $W_{\dim(V_y)}$,
we obtain a new representation of $V_1$ as a right-$\mathcal{N}$-submodule using these fibres
$W_d$, so that the action of $U_\alpha$ is now described by a measurable family of unitaries
$\Psi'(y) \in \mathcal{U}(W_{\dim(V_y)})$. Picking an orthonormal basis for each $W_d$, writing these
unitary operators as unitary matrices in terms of these bases, noting that their
individual entries are now identified with elements of $L^\infty(\nu) = \Phi(\mathcal{N})$ and carrying
everything back to $L^2(\tau)$ using $\Phi^{-1}$ gives the desired expression for $U_\alpha$. □

Remark 4.1. Frustratingly, both the fact that a $U_\alpha$-invariant $V$ of finite lifted trace
may be approximated by a $U_\alpha$-invariant $V_1$ of finite rank, and the fact that given
such a module of finite rank the action of $U_\alpha$ on it may be described by a unitary
element in $\mathcal{U}(M_{r \times r}(\mathcal{N}))$, seem to be difficult to prove without the assumption that
$\mathcal{N}$ is central and the resulting representation of the action of $\mathcal{N}$ on $L^2(\tau)$ as the
multiplication action of some $L^\infty(\nu)$ on a measurable field of Hilbert spaces. It
would be interesting to settle this issue more generally:

Question 4.2. Do these conclusions hold for a finite-lifted-trace invariant submodule
corresponding to an arbitrary inclusion of finite von Neumann algebras with a
trace-preserving automorphism? ◁



Before moving on let us quickly note an important difference from the setting of 
abelian von Neumann algebras. 

Example 4.2. If $\mathcal{M}$ is abelian, then from commutative ergodic theory it is well-
known that all the intermediate $U_\alpha$-invariant submodules $V \le L^2(\tau)$ that have
finite rank over $\mathcal{N}$ together generate an intermediate subalgebra between $\mathcal{N}$ and
$\mathcal{M}$, and that this then corresponds to an intermediate measure-preserving system.
We will see shortly that an analogous conclusion can sometimes be recovered in the
asymptotically abelian setting, but it is certainly not true for general finite-rank
submodules, even when the smaller algebra $\mathcal{N}$ is abelian.

Consider, for example, the inclusion $\iota : L\mathbb{Z} \cong L^\infty(m_{\mathbb{T}}) \hookrightarrow L\mathbb{F}_2$ corresponding to
the embedding of $\mathbb{Z}$ as the cyclic subgroup $a^{\mathbb{Z}}$ of the free group $\mathbb{F}_2 = \langle a, b \rangle$. Here
$LG$ is the group von Neumann algebra of $G$, defined in Section 2.1. In this case
we can identify $L^2(\tau)$ as $\ell^2(\mathbb{F}_2)$ and $L^2(\tau|_{\mathcal{N}})$ as the subspace spanned by $\{\xi_{a^n}\}_{n \in \mathbb{Z}}$.
Now define $\alpha \in \mathrm{Aut}\, L\mathbb{F}_2$ simply by lifting the group automorphism of $\mathbb{F}_2$ that fixes
$a$ and maps $b \mapsto ba$. Now the subspace $V := \overline{\mathrm{lin}}\{\xi_{ba^n} : n \in \mathbb{Z}\} \le \ell^2(\mathbb{F}_2)$ is a
$U_\alpha$-invariant right $\mathcal{N}$-module of rank one which is orthogonal to $L^2(\tau|_{\mathcal{N}})$. On the
other hand, although $\xi_b \in \mathcal{M} \cap V$, we have $\alpha^m(\xi_b^2) = \alpha^m(\xi_{b^2}) = \xi_{ba^m ba^m}$ for $m \in \mathbb{Z}$,
and it is easy to see that these elements of $\mathcal{M}$ do not remain within any finite-rank
right-$\mathcal{N}$-submodule.

It is true that if $L^2(\tau) \ominus L^2(\tau|_{\mathcal{N}})$ contains a finite-rank right-$\mathcal{N}$-submodule $V$, then
it also contains a finite-rank left-$\mathcal{N}$-module in the form of $J(V)$, where $J$ is the
modular automorphism on $L^2(\tau)$, defined by extending the conjugation map $a \mapsto a^*$
on $\mathcal{M} \cong \mathcal{M}1$ by density. The point is that it can happen that $J(V) \perp V$, and that
all elements of $J(V)$ are weakly mixed by $U_\alpha$: it is the right-module $V$, and no
other, that serves as the obstruction to overall relative weak mixing coming from
Theorem 1.8. ◁
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We now introduce a useful technical concept. 

Definition 4.3 (Central vectors). A vector $\xi \in L^2(\tau)$ is central if $m\xi = \xi m$ for
all $m \in \mathcal{M}$.

Lemma 4.4 (No non-obvious central vectors). The closure $\overline{Z(\mathcal{M})1} = \overline{1Z(\mathcal{M})}$ is
equal to the set of all central vectors in $L^2(\tau)$.

Proof. Suppose that $\xi \in L^2(\tau)$ is central. Define $a_\xi : \mathcal{M}1 \to L^2(\tau)$ by $a_\xi(m1) := \xi m$.
This is a densely-defined linear operator on $L^2(\tau)$, and it is closable because if
$m_n 1 = 1 m_n \to 0$ in $\|\cdot\|_{L^2(\tau)}$ for some sequence $(m_n)_{n \ge 1}$ in $\mathcal{M}$ and also $\xi m_n \to \xi'$
in $\|\cdot\|_{L^2(\tau)}$, then we have

\[ \langle m'1, \xi' \rangle = \lim_{n \to \infty} \langle m'1, \xi m_n \rangle = \lim_{n \to \infty} \langle 1 m_n^*, (m')^* \xi \rangle = 0 \]

for every $m' \in \mathcal{M}$, and so in fact we must have $\xi' = 0$. Also, we clearly have

\[ a_\xi(m1) = a_\xi(1m) = \xi m = m\xi = (a_\xi(1)) m = m (a_\xi(1)) \]

for every $m \in \mathcal{M}$, so $a_\xi$ is affiliated with both the right- and left-actions of $\mathcal{M}$ on
$L^2(\tau)$. The same therefore holds for $a_\xi + a_\xi^*$ and $i(a_\xi - a_\xi^*)$, and now these are self-
adjoint and so each of them may be expressed as an unbounded spectral integral
all of whose contributing spectral projections must lie in $\mathcal{M}_{\mathrm{left}}' \cap \mathcal{M}_{\mathrm{right}}' = Z(\mathcal{M})$.
Therefore, approximating $a_\xi = \frac{1}{2}(a_\xi + a_\xi^*) + \frac{1}{2}(a_\xi - a_\xi^*)$ by a sum of two large
but bounded integrals with respect to the respective resolutions of the identity,
we obtain a sequence of elements $a_n \in Z(\mathcal{M})$ such that $a_n \to a_\xi$ pointwise on
$\mathrm{dom}(\mathrm{clos}(a_\xi)) \supseteq \mathcal{M}1$, and hence such that $a_n 1 \to \xi$ in $\|\cdot\|_{L^2(\tau)}$. Hence $\xi \in \overline{Z(\mathcal{M})1}$,
as required. □

Proposition 4.5. If $(\mathcal{M}, \tau, \alpha)$ is an asymptotically abelian von Neumann dynamical
system, $\mathcal{N}$ is a shift-invariant central von Neumann subalgebra, and $V \le L^2(\tau)$
is an $\alpha$-invariant right-$\mathcal{N}$-submodule having finite lifted trace then all elements
of $V$ are central vectors.

Proof. Clearly it will suffice to prove this for all finite-rank approximants $V_1$ to $V$
as given by Lemma 4.1. Thus we may assume that $V$ actually has finite rank. Let
$\xi_1, \xi_2, \ldots, \xi_r$ and $U = (u_{ji})_{1 \le i,j \le r} \in M_{r \times r}(\mathcal{N})$ be as given by the third part of
that lemma.

Since $\alpha$ is asymptotically abelian, we have for any $a1 \in \mathcal{M}1$ and $b \in \mathcal{M}$ that

\[ \frac{1}{N} \sum_{n=1}^N \|b U_\alpha^n(a1) - U_\alpha^n(a1) b\|_{L^2(\tau)} = \frac{1}{N} \sum_{n=1}^N \|b \alpha^n(a) - \alpha^n(a) b\|_{L^2(\tau)} \to 0. \]

Approximating an arbitrary $\xi \in L^2(\tau)$ by elements of $\mathcal{M}1$, it follows that for each
fixed $b \in \mathcal{M}$ and $\xi \in L^2(\tau)$ we have

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \|b U_\alpha^n(\xi) - U_\alpha^n(\xi) b\|_{L^2(\tau)} = 0. \]
n=l 
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On the other hand, we know that 



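\[ U_\alpha(\xi_i) = \sum_{j=1}^r \xi_j u_{ji} \qquad \forall i = 1, 2, \ldots, r, \]

and so, iterating and writing $(u^{(n)}_{ji})_{1 \le i,j \le r}$ for the matrix over $\mathcal{N}$ that represents
$U_\alpha^{-n}$ on this basis in the same way, we have

\[ U_\alpha^{-n}(\xi_i) = \sum_{j=1}^r \xi_j u^{(n)}_{ji} \ \Longrightarrow\ \xi_i = \sum_{j=1}^r U_\alpha^n(\xi_j u^{(n)}_{ji}) \qquad \forall i = 1, 2, \ldots, r. \]

Clearly each $(u^{(n)}_{ji})_{1 \le i,j \le r}$ is still a unitary, and so from this, averaging in $n$ and the
centrality of $\mathcal{N}$ we obtain

\[ \|b\xi_i - \xi_i b\|_{L^2(\tau)} = \Big\| \frac{1}{N} \sum_{n=1}^N \Big( b \sum_{j=1}^r U_\alpha^n(\xi_j u^{(n)}_{ji}) - \sum_{j=1}^r U_\alpha^n(\xi_j u^{(n)}_{ji})\, b \Big) \Big\|_{L^2(\tau)} \]
\[ = \Big\| \frac{1}{N} \sum_{n=1}^N \sum_{j=1}^r \big( b U_\alpha^n(\xi_j) - U_\alpha^n(\xi_j) b \big)\, \alpha^n(u^{(n)}_{ji}) \Big\|_{L^2(\tau)} \]
\[ \le \sum_{j=1}^r \frac{1}{N} \sum_{n=1}^N \| b U_\alpha^n(\xi_j) - U_\alpha^n(\xi_j) b \|_{L^2(\tau)}, \]

and now since each of the summands in $j$ tends to 0 as $N \to \infty$, it follows that we
must in fact have $b\xi_i = \xi_i b$ for every $i \le r$, and hence (taking $\mathcal{N}$-linear combinations,
which have central coefficients, and then a completion) that all vectors in $V$ are
central, as required. □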

Let us note explicitly the following simple corollary of the above result. 

Corollary 4.6. If $(\mathcal{M}, \tau, \alpha)$ is an asymptotically abelian von Neumann dynamical
system, then the subalgebra $\mathcal{M}^\alpha := \{a \in \mathcal{M} : \alpha(a) = a\}$ of individually $\alpha$-invariant
elements is central.

Proof. Of course, if $\alpha(a) = a$ then $\mathrm{lin}\{1a\}$ is a rank-one $\alpha$-invariant submodule
of $L^2(\tau)$ for the trivial central subalgebra $\mathcal{N} := \mathbb{C}1$, and the claim follows from
Proposition 4.5. This claim can also be easily verified directly from the definition
of asymptotic abelianness. □
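The direct verification mentioned at the end of the proof can be sketched as follows. This assumes that asymptotic abelianness is used in its Cesàro form, that is, $\frac{1}{N}\sum_{n=1}^N \|[\alpha^n(a), b]\|_{L^2(\tau)} \to 0$ for all $a, b \in \mathcal{M}$, which is the form invoked in the proof of Proposition 4.5.

```latex
% Direct check that an alpha-invariant element a is central
% (assumes the Cesaro form of asymptotic abelianness stated in the lead-in).
\[
\alpha(a) = a
\ \Longrightarrow\
\frac{1}{N}\sum_{n=1}^N \big\| [\alpha^n(a), b] \big\|_{L^2(\tau)}
  = \big\| [a, b] \big\|_{L^2(\tau)}
  \quad \text{for every } N \ge 1 \text{ and } b \in \mathcal{M}.
\]
% The left-hand side tends to 0 as N -> infinity, so ||[a,b]||_{L^2(\tau)} = 0,
% and faithfulness of tau gives ab = ba for every b in M.
```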

Finally we can use the above results to prove Theorem 1.9.

Proof of Theorem 1.9. Suppose, for the sake of contradiction, that $\alpha$ were
not weakly mixing relative to $Z(\mathcal{M}) \subset \mathcal{M}$. Then Proposition 3.6 gives a non-
trivial right-$Z(\mathcal{M})$-submodule $V \le L^2(\tau) \ominus \overline{Z(\mathcal{M})1}$ of finite lifted trace, and
now Proposition 4.5 tells us that $V$ must consist of central vectors. However,
Lemma 4.4 now gives $V \le \overline{Z(\mathcal{M})1}$, implying a contradiction with our assumption
that $V \perp Z(\mathcal{M})1$. □






TIM AUSTIN, TANJA EISNER, AND TERENCE TAO 



Note that for the results in this section it suffices to assume that for every $a \in \mathcal{M}$
there exists a sequence $\{n_j\}$ such that $\lim_{j \to \infty} \|[\alpha^{n_j}(a), b]\|_{L^2(\tau)} = 0$ for every
$b \in \mathcal{M}$. We do not know whether this condition is strictly weaker than asymptotic
abelianness.



Remark 4.3. A variant of Theorem 1.9 can also be deduced from the results in [31]
(and more specifically, Theorem 4.2 and Proposition 5.5 of that paper); we thank
the anonymous referee for pointing out this fact. More specifically, the result is that
if $\alpha$ is an automorphism of a finite von Neumann algebra $\mathcal{M}$ that leaves invariant
a faithful normal trace $\tau$, and $E_r$ is the conditional expectation to the factor

\[ \mathcal{M}_r := \overline{\mathrm{lin}}^{\,wot} \{a \in \mathcal{M} : \alpha(a) = \lambda a \text{ for some } \lambda \in \mathbb{T}\}, \]

then for any $a, b \in \mathcal{M}$ one has

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \big| \langle E_r(a^* \alpha^n(a)) - E_r(a)^* \alpha^n(E_r(a)),\, b \rangle_{L^2(\tau)} \big| = 0; \]

in particular, for $n$ going to infinity along a density one set of integers, the
expression $E_r(a^* \alpha^n(a)) - E_r(a)^* \alpha^n(E_r(a))$ converges to zero in the weak operator
topology. This property is weaker than the relative weak mixing property with
respect to this factor (which one does not expect to hold in general, even in the
abelian case), but on the other hand does not require any hypothesis of asymptotic
abelianness.



5. Triple averages for non-asymptotically-abelian systems 

The use to which we put relative weak mixing in the preceding section is very
special to asymptotically abelian systems: in general there seems to be no way to
track the error term resulting from the re-arrangement at the heart of the proof
of Theorem 1.8 without this assumption. However, in the special case of triple
averages this problem does simplify somewhat, provided we assume instead that
our system $(\mathcal{M}, \tau, \alpha)$ is ergodic, so that $\mathcal{M}^\alpha = \mathbb{C}1$. In this case we will be able
to obtain convergence weakly and in norm, as well as recurrence on a dense set
(Theorem 1.10).

This assumption is not so innocuous as might be expected from its analog in the
world of commutative ergodic theory. In that setting it is possible quite generally
to decompose a system (that is, more precisely, to decompose its invariant measure)
into ergodic components, and then many assertions about the whole system,
including multiple recurrence and the convergence of multiple averages, follow if
they can be proved for each ergodic component separately. However, in the setting
of a general von Neumann dynamical system, this decomposition is available
only if $\mathcal{M}^\alpha$ is central in $\mathcal{M}$; otherwise the automorphism $\alpha$ can exhibit genuinely
new phenomena precisely in virtue of having the nontrivial fixed subalgebra $\mathcal{M}^\alpha$
to 'move around'. This was already seen in the failure of recurrence on a dense set
when the ergodicity hypothesis is dropped (Theorem 1.12).

The key for convergence of triple averages is the following decomposition similar
to the commutative case, first established (in a slightly more general setting) in
[31] (and more specifically, from Theorem 4.2 and Proposition 5.5 in that paper);
for the convenience of the reader we give a short proof of that decomposition here.
Note that the result does not require ergodicity of the system. We remark that a
closely related decomposition was also used in [13].

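Proposition 5.1 (Decomposition of von Neumann dynamical systems). [31] Let
$(\mathcal{M}, \tau, \alpha)$ be a von Neumann dynamical system. Then one has the orthogonal
decomposition $\mathcal{M} = \mathcal{M}_r \oplus \mathcal{M}_s$, where

\[ \mathcal{M}_r := \overline{\mathrm{lin}} \{a \in \mathcal{M} : \alpha(a) = \lambda a \text{ for some } \lambda \in \mathbb{T}\} \]

and

\[ \mathcal{M}_s := \Big\{ a \in \mathcal{M} : \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N |\tau(b \alpha^n(a))| = 0 \text{ for every } b \in \mathcal{M} \Big\}, \]

i.e., $\mathcal{M}_r$ is the von Neumann subalgebra spanned by the eigenvectors of $\alpha$ and $\mathcal{M}_s$
is the subspace of the elements of $\mathcal{M}$ that are weakly mixed by $\alpha$. The corresponding
projection onto $\mathcal{M}_r$ is the conditional expectation of $\mathcal{M}$ onto $\mathcal{M}_r$ and in particular
preserves positivity.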



Proof. Since the continuation $U_\alpha$ of $\alpha$ to $L^2(\tau)$ is a unitary operator, the Jacobs-
Glicksberg-de Leeuw decomposition holds for $U_\alpha$ (see e.g. [29, Section 2.4]), i.e.,
$L^2(\tau) = L^2_r(\tau) \oplus L^2_s(\tau)$, where the reversible part $L^2_r(\tau)$ is defined as

\[ L^2_r(\tau) = \overline{\mathrm{lin}} \{x : U_\alpha(x) = \lambda x \text{ for some } \lambda \in \mathbb{T}\} \]

and the stable part $L^2_s(\tau)$ is defined as the space of all $x \in L^2(\tau)$ such that

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N |\langle U_\alpha^n x, y \rangle| = 0 \quad \text{for every } y \in L^2(\tau). \]

Moreover, this decomposition is orthogonal since $U_\alpha$ is unitary. Note that we do
not need here the Jacobs-Glicksberg-de Leeuw decomposition in full generality
but only its version for unitary operators, which can also be proved via the spectral
theorem.

By a result of Størmer [34], the eigenvectors of $U_\alpha$ belong to $\mathcal{M}$. We thus have
$\mathcal{M}_r = \mathcal{M} \cap L^2_r(\tau)$ and $\mathcal{M}_s = \mathcal{M} \cap L^2_s(\tau)$. The fact that the weak operator closure and the
closure in the $L^2(\tau)$-topology coincide for self-adjoint subalgebras implies the second
formula for $\mathcal{M}_r$ and thus $\mathcal{M}_r$ is a von Neumann subalgebra of $\mathcal{M}$. The conditional
expectation now maps $\mathcal{M}$ onto $\mathcal{M}_r$, assuring the orthogonal decomposition
$\mathcal{M} = \mathcal{M}_r \oplus \mathcal{M}_s$. □

In the remainder of this section we assume our system is ergodic. 

Proposition 5.2 (Convergence of triple averages). Let $(\mathcal{M}, \tau, \alpha)$ be an ergodic von
Neumann dynamical system. Then the averages

\[ \frac{1}{N} \sum_{n=1}^N \alpha^n(a) \alpha^{2n}(b) \tag{24} \]

converge in $\|\cdot\|_{L^2(\tau)}$ as $N \to \infty$ for every $a, b \in \mathcal{M}$.
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Proof. By the above proposition, it suffices to assume that a and b each belong to 
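$\mathcal{M}_r$ or $\mathcal{M}_s$. Suppose first that $a \in \mathcal{M}_r$ and fix $b$. The operators $S_N$ given by

\[ S_N a := \frac{1}{N} \sum_{n=1}^N \alpha^n(a) \alpha^{2n}(b) \]

are linear and bounded on $\mathcal{M}$ for the norm $\|\cdot\|_{L^2(\tau)}$, so we may assume that
$\alpha(a) = \lambda a$ for some $\lambda \in \mathbb{T}$. Then $S_N a = \frac{1}{N} \sum_{n=1}^N a (\lambda \alpha^2)^n(b)$, which converges in
$L^2(\tau)$ by the mean ergodic theorem.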
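The reduction to eigenvectors can be spelled out in one line. The sketch below assumes $\alpha(a) = \lambda a$ with $|\lambda| = 1$ and writes $T := \lambda U_\alpha^2$ for the resulting unitary on $L^2(\tau)$; the symbols $T$ and $P_T$ are introduced here only for illustration.

```latex
% Eigenvector case: alpha(a) = lambda a, |lambda| = 1, T := lambda U_alpha^2 (unitary).
\begin{align*}
\alpha^n(a)\,\alpha^{2n}(b)
  &= \lambda^n a\,\alpha^{2n}(b)
   = a\,(\lambda\alpha^2)^n(b), \\
S_N a \cdot 1
  &= a \cdot \frac{1}{N}\sum_{n=1}^N T^n(b1)
  \;\longrightarrow\; a \cdot P_T(b1)
  \quad \text{in } L^2(\tau),
\end{align*}
% where P_T is the orthogonal projection onto the T-fixed vectors (von Neumann's
% mean ergodic theorem), and left multiplication by a is bounded on L^2(\tau).
```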

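Suppose now that $a \in \mathcal{M}_s$. We show that the desired limit is zero. Consider
$u_n := \alpha^n(a) \alpha^{2n}(b) 1$ and observe that

\[ \langle u_n, u_{n+j} \rangle = \tau(\alpha^{2n}(b^*) \alpha^n(a^*) \alpha^{n+j}(a) \alpha^{2n+2j}(b)) \]
\[ = \tau(\alpha^n(b^*)\, a^* \alpha^j(a)\, \alpha^{n+2j}(b)) = \tau(a^* \alpha^j(a)\, \alpha^n(\alpha^{2j}(b)\, b^*)). \]

The ergodicity of the system implies

\[ \gamma_j := \lim_{N \to \infty} \Big| \frac{1}{N} \sum_{n=1}^N \langle u_n, u_{n+j} \rangle \Big| = \Big| \tau\Big( a^* \alpha^j(a) \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \alpha^n(\alpha^{2j}(b)\, b^*) \Big) \Big| \]
\[ = |\tau(a^* \alpha^j(a))| \cdot |\tau(\alpha^{2j}(b)\, b^*)|. \]

Since $a \in \mathcal{M}_s$ and $\tau(\alpha^{2j}(b)\, b^*)$ are bounded in $j$, we have

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^N \gamma_j = 0, \]

and therefore by the classical van der Corput lemma for Hilbert spaces (see e.g.
[IS] or [5]), we have $\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N u_n = 0$. □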
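For reference, one common formulation of the van der Corput lemma that fits the way it is applied here is recorded below. This is a sketch of a standard version, stated with $\limsup$; the cited references may phrase the hypotheses slightly differently.

```latex
% A standard Hilbert-space van der Corput lemma (one common formulation).
% Let (u_n)_{n \ge 1} be a bounded sequence in a Hilbert space H, and set
\[
\gamma_j := \limsup_{N\to\infty}\,
  \Big| \frac{1}{N}\sum_{n=1}^N \langle u_n, u_{n+j} \rangle \Big| ,
  \qquad j \ge 1 .
\]
% If the gamma_j vanish on average, then the u_n average to zero in norm:
\[
\lim_{J\to\infty} \frac{1}{J}\sum_{j=1}^J \gamma_j = 0
\quad\Longrightarrow\quad
\lim_{N\to\infty} \Big\| \frac{1}{N}\sum_{n=1}^N u_n \Big\|_H = 0 .
\]
```

In the proof above the sequence $u_n = \alpha^n(a)\alpha^{2n}(b)1$ is bounded by $\|a\|\,\|b\|$ in $L^2(\tau)$, so the boundedness hypothesis is automatic.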

Remarks 5.1. (1) For compact non-ergodic systems the averages (24) converge
as well, since $\mathcal{M} = \mathcal{M}_r$ in this case; this was also observed in [6].
(2) As in the commutative case we see that the Kronecker subalgebra $\mathcal{M}_r$ is
characteristic for (24), i.e., the limit of the averages in (24) does not change
if replacing $a$ by $E_{\mathcal{M}_r} a$ and $b$ by $E_{\mathcal{M}_r} b$. ◁

As was shown in Corollary 2.4, one cannot expect the limit

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \tau(a \alpha^n(a) \alpha^{2n}(a)) \]

to be positive for every positive $a$. However, a modification extending [SJ Theorem
5.13] is still true.

Proposition 5.3. For an ergodic von Neumann system $(\mathcal{M}, \tau, \alpha)$, one has

\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \big( \mathrm{Re}\, \tau(a \alpha^n(a) \alpha^{2n}(a)) \big)_+ > 0 \]

for every $0 < a \in \mathcal{M}$. In particular, one has recurrence on a dense set.

Proof. Decompose $a = b + c$ with $b \in \mathcal{M}_r$ and $c \in \mathcal{M}_s$ as in Proposition 5.1, with
$b > 0$ by Lemma [33]. We first show that there exists a compact abelian group $G$, an
open set $U \subset G$ and $g \in G$ such that for the 1-step Bohr set $K_U := \{n \in \mathbb{N} : g^n \in U\}$
one has

\[ \mathrm{Re}\, \tau(b \alpha^n(b) \alpha^{2n}(b)) > \frac{\tau(b^3)}{2} > 0 \quad \text{for every } n \in K_U. \tag{25} \]

Take $\varepsilon := \frac{\tau(b^3)}{18 \|b\|^2}$. Since $b \in \mathcal{M}_r$, we find $k \in \mathbb{N}$, $\lambda_1, \ldots, \lambda_k \in \mathbb{T}$ and $b_1, \ldots, b_k \in \mathcal{M} \setminus \{0\}$
such that $\alpha(b_j) = \lambda_j b_j$ for every $j = 1, \ldots, k$ and $\|b - (b_1 + \ldots + b_k)\|_{L^2(\tau)} < \varepsilon$.
Set now $G := \mathbb{T}^k$, $g := (\lambda_1, \ldots, \lambda_k)$ and $U := U_{\varepsilon/(k \max_j \|b_j\|)}(1) \subset \mathbb{T}^k$. Observe
that for every $n$ such that $g^n \in U$, we have $|\lambda_j^n - 1| < \varepsilon/(k \max_j \|b_j\|)$ for every
$j = 1, \ldots, k$ and therefore

\[ \|\alpha^n(b) - b\|_{L^2(\tau)} \le \|\alpha^n(b_1 + \ldots + b_k) - (b_1 + \ldots + b_k)\|_{L^2(\tau)} + 2 \|b_1 + \ldots + b_k - b\|_{L^2(\tau)} \]
\[ < \max_j \|b_j\|_{L^2(\tau)} \big( |\lambda_1^n - 1| + \ldots + |\lambda_k^n - 1| \big) + 2\varepsilon \]
\[ < \max_j \|b_j\| \cdot \frac{k \varepsilon}{k \max_j \|b_j\|} + 2\varepsilon = 3\varepsilon. \]

So we have by the Cauchy-Schwarz inequality

\[ |\tau(b \alpha^n(b) \alpha^{2n}(b)) - \tau(b^3)| \le |\tau(b \alpha^n(b) (\alpha^{2n}(b) - b))| + |\tau(b (\alpha^n(b) - b) b)| \]
\[ \le \|b\|^2 \big( \|\alpha^{2n}(b) - b\|_{L^2(\tau)} + \|\alpha^n(b) - b\|_{L^2(\tau)} \big) \]
\[ \le 3 \|b\|^2 \|\alpha^n(b) - b\|_{L^2(\tau)} < 9 \|b\|^2 \varepsilon = \frac{\tau(b^3)}{2}, \]

and (25) is proved.
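The constants in this step fit together as follows. This is a short arithmetic check, taking the choice of $\varepsilon$ to be $\tau(b^3)/(18\|b\|^2)$ and using $\|\alpha^{2n}(b) - b\|_{L^2(\tau)} \le 2\|\alpha^n(b) - b\|_{L^2(\tau)}$, which follows from the triangle inequality and the fact that $\alpha^n$ is an $L^2(\tau)$-isometry.

```latex
% Bookkeeping for (25), assuming epsilon := tau(b^3) / (18 ||b||^2).
\begin{align*}
\|\alpha^{2n}(b)-b\|_{L^2(\tau)}
  &\le \|\alpha^{n}(\alpha^n(b)-b)\|_{L^2(\tau)} + \|\alpha^n(b)-b\|_{L^2(\tau)}
   = 2\,\|\alpha^n(b)-b\|_{L^2(\tau)}, \\
3\,\|b\|^2 \cdot 3\varepsilon
  &= 9\,\|b\|^2 \cdot \frac{\tau(b^3)}{18\,\|b\|^2}
   = \frac{\tau(b^3)}{2}, \\
\mathrm{Re}\,\tau\big(b\,\alpha^n(b)\,\alpha^{2n}(b)\big)
  &\ge \tau(b^3) - \big|\tau\big(b\,\alpha^n(b)\,\alpha^{2n}(b)\big) - \tau(b^3)\big|
   > \frac{\tau(b^3)}{2}.
\end{align*}
```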



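Take now $V := U_{\varepsilon/(2k \max_j \|b_j\|)}(1) \subset U$ and a continuous function $f : G \to [0, 1]$
satisfying $1_V \le f \le 1_U$. Then by (25), $\mathrm{Re}\, \tau(b \alpha^n(b) \alpha^{2n}(b))$ is positive whenever
$f(g^n) \ne 0$ and therefore

\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n)\, \mathrm{Re}\, \tau(b \alpha^n(b) \alpha^{2n}(b)) \ge \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N 1_V(g^n)\, \mathrm{Re}\, \tau(b \alpha^n(b) \alpha^{2n}(b)). \]

Since the set $K_V := \{n \in \mathbb{N} : g^n \in V\} \subset K_U$ is syndetic (i.e. has bounded gaps) in
$\mathbb{N}$, this implies by (25)

\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n)\, \mathrm{Re}\, \tau(b \alpha^n(b) \alpha^{2n}(b)) > 0. \tag{26} \]

Next, we show that

\[ \|\cdot\|_{L^2(\tau)}\text{-}\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n)\, \alpha^n(b) \alpha^{2n}(c) = 0. \tag{27} \]

To do this, we first consider a character $\gamma \in \hat{G}$ and define

\[ u_n := \gamma(g^n)\, \alpha^n(b) \alpha^{2n}(c) 1. \]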
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We have 

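\[ \langle u_n, u_{n+j} \rangle = \overline{\gamma(g^n)}\, \gamma(g^{n+j})\, \tau(\alpha^{2n}(c^*) \alpha^n(b^*) \alpha^{n+j}(b) \alpha^{2n+2j}(c)) \]
\[ = \gamma(g^j)\, \tau(\alpha^n(c^*)\, b^* \alpha^j(b)\, \alpha^{n+2j}(c)) = \gamma(g^j)\, \tau(b^* \alpha^j(b)\, \alpha^n(\alpha^{2j}(c)\, c^*)). \]

By ergodicity of $\alpha$,

\[ \gamma_j := \lim_{N \to \infty} \Big| \frac{1}{N} \sum_{n=1}^N \langle u_n, u_{n+j} \rangle \Big| = \Big| \gamma(g^j)\, \tau\Big( b^* \alpha^j(b) \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \alpha^n(\alpha^{2j}(c)\, c^*) \Big) \Big| \]
\[ = |\tau(b^* \alpha^j(b))| \cdot |\tau(\alpha^{2j}(c)\, c^*)| \]

and the assumption $c \in \mathcal{M}_s$ implies $\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^N \gamma_j = 0$. By the van der
Corput estimate we thus have

\[ \|\cdot\|_{L^2(\tau)}\text{-}\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \gamma(g^n)\, \alpha^n(b) \alpha^{2n}(c) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N u_n = 0. \]

Since the characters form a total set in $C(G)$ and the operators

\[ S_N f := \frac{1}{N} \sum_{n=1}^N f(g^n)\, \alpha^n(b) \alpha^{2n}(c) \]

are uniformly bounded on $C(G)$, (27) is proved. Analogously one also has

\[ \|\cdot\|_{L^2(\tau)}\text{-}\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n)\, \alpha^n(c) \alpha^{2n}(b) = \|\cdot\|_{L^2(\tau)}\text{-}\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n)\, \alpha^n(c) \alpha^{2n}(c) = 0. \]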

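The Cauchy-Schwarz inequality implies now that

\[ \limsup_{N \to \infty} \Big| \frac{1}{N} \sum_{n=1}^N f(g^n)\, \tau(c \alpha^n(b) \alpha^{2n}(c)) \Big| = \limsup_{N \to \infty} \Big| \tau\Big( c \cdot \frac{1}{N} \sum_{n=1}^N f(g^n)\, \alpha^n(b) \alpha^{2n}(c) \Big) \Big| \]
\[ \le \|c\|_{L^2(\tau)} \limsup_{N \to \infty} \Big\| \frac{1}{N} \sum_{n=1}^N f(g^n)\, \alpha^n(b) \alpha^{2n}(c) \Big\|_{L^2(\tau)} = 0 \]

and analogously for the Cesàro sums of $f(g^n) \tau(c \alpha^n(c) \alpha^{2n}(b))$, $f(g^n) \tau(c \alpha^n(c) \alpha^{2n}(c))$
and $f(g^n) \tau(b \alpha^n(c) \alpha^{2n}(c))$, while

\[ \tau(c \alpha^n(b) \alpha^{2n}(b)) = \tau(b \alpha^n(b) \alpha^{2n}(c)) = \tau(b \alpha^n(c) \alpha^{2n}(b)) = 0 \]

follows from the orthogonality of $\mathcal{M}_r$ and $\mathcal{M}_s$ and the fact that $\mathcal{M}_r$ is an $\alpha$-invariant
self-adjoint subalgebra of $\mathcal{M}$.

Combining this with (26), we obtain by the linearity of $\tau$

\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \big( \mathrm{Re}\, \tau(a \alpha^n(a) \alpha^{2n}(a)) \big)_+ \ge \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n)\, \mathrm{Re}\, \tau(a \alpha^n(a) \alpha^{2n}(a)) \]
\[ = \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^N f(g^n) \big( \mathrm{Re}\, \tau(b \alpha^n(b) \alpha^{2n}(b)) \big)_+ > 0. \]

□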
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6. Closing remarks 

We present some remarks concerning Problem 1.15. By Theorem 1.10, we have a
positive answer to this question when the invariant algebra $\mathcal{M}^\alpha$ is trivial. One can
also extend these arguments to cover the case when the invariant algebra $\mathcal{M}^\alpha$ is
central by representing $\mathcal{M}$ as a direct integral over $\mathcal{M}^\alpha$, see Kadison, Ringrose [26,
Chapter 14].

It is clear that if the answer to Problem 1.16 is always positive, then the same
is true for Problem 1.15. What is less obvious is that the converse is true: if the
answer to Problem 1.15 is always positive, then the answer to Problem 1.16 is always
positive. To see this, let $(\mathcal{M}, \tau)$ be a finite von Neumann algebra with two commuting
shifts $\alpha_1, \alpha_2$. We then form the infinite tensor product $\mathcal{M}^{\mathbb{Z}} := \bigotimes_{n \in \mathbb{Z}} \mathcal{M}$, which is
another finite von Neumann algebra, which contains an embedded copy of $\mathcal{M}$ by
using the 0 coordinate of $\mathbb{Z}$. Next, let $G$ be the free abelian group on two generators
$e, f$, and let $U$ be the action of $G$ on $\mathcal{M}^{\mathbb{Z}}$ defined by

tf(e)(g)a„ -(gjafc^K) 

n£Z nGZ 

and 

U(f)(g)a n :=(g)a„_i 

riGZ n£Z 

for all $a_n \in \mathcal{M}$ with all but finitely many $a_n$ equal to 1. If we define a shift $\alpha'$ on
$\mathcal{M}^{\mathbb{Z}}$ by the formula

a a n := a 2 1 {n+1] ' a 2 n "(a n ) 

riGZ riGZ 

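we then observe the identities

\[ \alpha' U(e) (\alpha')^{-1} = U(e) \]

and

\[ \alpha' U(f) (\alpha')^{-1} = U(fe) \]

(here we use the hypothesis that $\alpha_1, \alpha_2$ commute). Because of this, we can define
a shift $\alpha$ on the crossed product $\mathcal{M}^{\mathbb{Z}} \rtimes_U G$ by declaring $\alpha$ to equal $\alpha'$ on $\mathcal{M}^{\mathbb{Z}}$, and

\[ \alpha(e) := e; \qquad \alpha(f) := fe. \]

If $a_1, a_2$ lie in $\mathcal{M}^{\mathbb{Z}}$ we observe that

\[ \alpha^n(a_1 f^2)\, \alpha^{2n}(f^{-2} a_2 f) = (\alpha')^n(a_1) \big( (\alpha')^{2n} U(e)^{-2n}(a_2) \big) f. \]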
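The identity just stated can be checked by moving the group elements through the product. The sketch below assumes the usual crossed-product convention $g x g^{-1} = U(g)(x)$ for $g \in G$ and $x \in \mathcal{M}^{\mathbb{Z}}$, together with the facts stated above: $e$ and $f$ commute, $\alpha(e) = e$, $\alpha(f) = fe$, and $\alpha'$ commutes with $U(e)$.

```latex
% Verification under the assumed convention g x g^{-1} = U(g)(x).
\begin{align*}
\alpha^n(f) &= f e^n, \\
\alpha^n(a_1 f^2) &= (\alpha')^n(a_1)\, f^2 e^{2n}, \\
\alpha^{2n}(f^{-2} a_2 f) &= f^{-2} e^{-4n}\, (\alpha')^{2n}(a_2)\, f e^{2n}, \\
\alpha^n(a_1 f^2)\,\alpha^{2n}(f^{-2} a_2 f)
  &= (\alpha')^n(a_1)\, e^{-2n}\, (\alpha')^{2n}(a_2)\, e^{2n} f \\
  &= (\alpha')^n(a_1)\, \big( U(e)^{-2n} (\alpha')^{2n}(a_2) \big)\, f \\
  &= (\alpha')^n(a_1)\, \big( (\alpha')^{2n} U(e)^{-2n}(a_2) \big)\, f .
\end{align*}
```

The role of the factors $f^2$ and $f^{-2}$ is visible in the fourth line: they produce exactly the conjugation by $e^{-2n}$ that twists the second term.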

If we assume that $a_1, a_2$ in fact lie in $\mathcal{M}$, we can simplify this as

\[ \alpha_1^{2n}(a_1)\, \alpha_2^{2n}(a_2)\, f. \]

Thus, if we assume Problem 1.15 has an affirmative answer for the system $\mathcal{M}^{\mathbb{Z}} \rtimes_U G$,
we see that the averages of $\alpha_1^{2n}(a_1) \alpha_2^{2n}(a_2) f$ (and hence of $\alpha_1^{2n}(a_1) \alpha_2^{2n}(a_2)$) converge
for arbitrary $a_1, a_2 \in \mathcal{M}$; from this one easily deduces (after dividing $n$ into even
and odd classes) that Problem 1.16 has an affirmative answer for the system $\mathcal{M}$.

In particular, we see that the task of establishing Problem 1.15 in the affirmative for
arbitrary von Neumann dynamical systems is at least as hard as that of achieving
convergence for two commuting shifts in the abelian case, a result first obtained in
0.
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One can also cover some other (non-ergodic, non-abelian) cases of Problem 1.15 by 
ad hoc methods. Suppose for instance that M. is a group von Neumann algebra LG, 
with shift a given by automorphisms a±, a 2 ■ G — > G of the group. Then one can 



affirmatively answer Problem 1.15 as follows. Firstly, by density and linearity we may assume that $a_1, a_2$ are themselves group elements: $a_1 = g_1 \in G$, $a_2 = g_2 \in G$. We then see that the means of $\alpha^n(g_1)\alpha^{2n}(g_2)$ will converge to zero unless there exists a group element $g_0$ for which

\[ \alpha^n(g_1)\alpha^{2n}(g_2) = g_0 \tag{28} \]

for all $n$ in a set of positive upper density. But such sets contain non-trivial parallelograms $n, n+h, n+k, n+h+k$ for $h, k > 0$. Applying (28) for $n, n+h$ and rearranging, one obtains

\[ \alpha^n(g_2\,\alpha^{2h}(g_2^{-1})) = g_1^{-1}\alpha^h(g_1). \]

Similarly, applying (28) for $n+k, n+h+k$ one has

\[ \alpha^{n+k}(g_2\,\alpha^{2h}(g_2^{-1})) = g_1^{-1}\alpha^h(g_1). \]
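Writing $u := g_1^{-1}\alpha^h(g_1)$, one thus has

\[ \alpha^h(g_1) = g_1 u \]

and

\[ \alpha^k(u) = u. \]

If we then write

\[ v := g_1^{-1}\alpha^{hk}(g_1) = u\,\alpha^h(u) \cdots \alpha^{(k-1)h}(u) \]

we see that

\[ \alpha^{hkn}(g_1) = g_1 v^n \]

for all $n$, and $\alpha^{hk}(v) = v$. Thus we have

\[ \alpha^{hkn+j}(g_1)\,\alpha^{2hkn+2j}(g_2) = \alpha^j\big(g_1 (v\alpha^{2hk})^n(\alpha^j(g_2))\big), \]

where $v\alpha^{2hk}$ denotes the map $x \mapsto v\,\alpha^{2hk}(x)$,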

for any $n, j$. The means of this in $n$ converge in $L^2(\tau)$ by the mean ergodic theorem. Summing over all $0 \le j < hk$ we obtain weak convergence, thus answering
Problem 1.15 affirmatively in this case. The same type of argument also lets one
deal with crossed products of abelian systems by groups, in which the shift acts as 
an automorphism on the group; we omit the details. 

Finally, we remark that the results on asymptotically abelian systems, while stated 
for $\mathbb{Z}^k$-systems, should in fact be valid for any commuting action of a general locally
compact second countable (lcsc) abelian group. 



Appendix A. An application of the van der Corput lemma 



The purpose of this appendix is to establish Theorem 1.8. Our arguments follow
[5T1 Proposition 7.4, Theorem 7.5] closely (see also [51 Proposition 4.4] for another 
adaptation of the same argument). We may normalise $\alpha_0$ to be the identity.

We induct on $k \ge 2$. When $k = 2$ we know from the usual mean ergodic theorem
for von Neumann algebras (see e.g. (5^1 Section 9.1]) that 

\[ \frac{1}{N}\sum_{n=1}^{N} \alpha^n(a) \to E_{\mathcal{M}^\alpha}(a) \quad \text{in } \|\cdot\|_{L^2(\tau)}, \]

and since $\mathcal{M}^\alpha \subset \mathcal{N}$ by the relative weak mixing assumption, we also have

\[ \frac{1}{N}\sum_{n=1}^{N} \alpha^n(E_{\mathcal{N}}(a)) \to E_{\mathcal{M}^\alpha}(E_{\mathcal{N}}(a)) = E_{\mathcal{M}^\alpha}(a) \quad \text{in } \|\cdot\|_{L^2(\tau)}, \]

so combining these conclusions gives the result.

Now suppose that $k \ge 3$ and that we know the desired conclusion for any similar family of $\ell < k$ automorphisms. By decomposing each $a_i$ as $(a_i - E_{\mathcal{N}}(a_i)) + E_{\mathcal{N}}(a_i)$ and expanding out the expression $\prod_{i=1}^{k-1} \alpha_i^n(a_i)$, we find that it suffices to show that for any $i \le k-1$ we have

\[ a_i \perp \mathcal{N} \quad \Longrightarrow \quad \frac{1}{N}\sum_{n=1}^{N} \prod_{j=1}^{k-1} \alpha_j^n(a_j) \to 0 \quad \text{in } \|\cdot\|_{L^2(\tau)}; \]

let us argue the case $i = 1$, the others following at once by symmetry.

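By the Hilbert-space-valued version of the classical van der Corput estimate (see, for instance, [TS] or [3]) this will follow if we show that

\begin{align*}
&\frac{1}{H}\sum_{h=1}^{H} \frac{1}{N}\sum_{n=1}^{N} \tau\Big( \Big(\prod_{i=1}^{k-1} \alpha_i^{n+h}(a_i)\Big)^{*} \prod_{i=1}^{k-1} \alpha_i^{n}(a_i) \Big) \\
&\quad = \frac{1}{H}\sum_{h=1}^{H} \frac{1}{N}\sum_{n=1}^{N} \tau\big( \alpha_{k-1}^n(\alpha_{k-1}^h(a_{k-1}^*)) \cdots \alpha_1^n(\alpha_1^h(a_1^*)) \cdot \alpha_1^n(a_1) \cdots \alpha_{k-1}^n(a_{k-1}) \big) \to 0
\end{align*}

as $N \to \infty$ and then $H \to \infty$.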



Let us now set $b_j := \alpha_j^n(\alpha_j^h(a_j^*))$ and $c_j := \alpha_j^n(a_j)$ to lighten notation. Having done so, we now set ourselves up for applying the asymptotic abelianness property by observing that

\begin{align*}
b_{k-1}b_{k-2}b_{k-3} \cdots c_1c_2 \cdots &= (b_{k-2}b_{k-1}b_{k-3} \cdots c_1c_2 \cdots) + ([b_{k-1}, b_{k-2}]b_{k-3} \cdots c_1c_2 \cdots) \\
&= (b_{k-2}b_{k-3}b_{k-1}b_{k-4} \cdots c_1c_2 \cdots) + (b_{k-2}[b_{k-1}, b_{k-3}]b_{k-4} \cdots c_1c_2 \cdots) \\
&\quad + ([b_{k-1}, b_{k-2}]b_{k-3}b_{k-4} \cdots c_1c_2 \cdots) \\
&= \cdots \\
&= b_{k-2}b_{k-3}b_{k-4} \cdots b_1c_1c_2 \cdots c_{k-2}(b_{k-1}c_{k-1}) \\
&\quad + \sum_{j=1}^{k-2} x_j[b_{k-1}, b_j]y_j + \sum_{j=1}^{k-2} u_j[b_{k-1}, c_j]v_j
\end{align*}

where each $x_j$, $y_j$, $u_j$ and $v_j$ for $1 \le j \le k-2$ is some product of a subset of the elements $\{b_i, c_i : i \le k-2\}$.

Importantly, there is some $M > 0$ such that $\|x_j\|, \|y_j\|, \|u_j\|, \|v_j\| \le M$ for all $j \le k-2$, and not depending on $n$ or $h$, while on the other hand for any $j \le k-2$ we have

\[ [b_{k-1}, b_j] = [\alpha_{k-1}^n(\alpha_{k-1}^h(a_{k-1}^*)), \alpha_j^n(\alpha_j^h(a_j^*))], \]

and hence overall we have

\begin{align*}
\Big| \frac{1}{N}\sum_{n=1}^{N} \tau\Big( \sum_{j=1}^{k-2} x_j[b_{k-1}, b_j]y_j \Big) \Big| &\le \sum_{j=1}^{k-2} \frac{1}{N}\sum_{n=1}^{N} |\tau(x_j[b_{k-1}, b_j]y_j)| \\
&\le M^2 \sum_{j=1}^{k-2} \frac{1}{N}\sum_{n=1}^{N} \big\| [\alpha_{k-1}^h(a_{k-1}^*), (\alpha_{k-1}^{-1}\alpha_j)^n(\alpha_j^h(a_j^*))] \big\|_{L^2(\tau)} \to 0
\end{align*}

as $N \to \infty$, by the asymptotic abelianness of $\alpha_{k-1}^{-1}\alpha_j$. The same reasoning applies to the term $\sum_{j=1}^{k-2} u_j[b_{k-1}, c_j]v_j$, and now applies again to show that in the scalar average of interest to us we may also commute $b_{k-2}$ from the left-hand end of our product over to be immediately on the left of $c_{k-2}$, and then move $b_{k-3}$ to $c_{k-3}$, and so on. Overall, this shows that

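\begin{align*}
&\frac{1}{H}\sum_{h=1}^{H} \frac{1}{N}\sum_{n=1}^{N} \tau\big( \alpha_{k-1}^n(\alpha_{k-1}^h(a_{k-1}^*)) \cdots \alpha_1^n(\alpha_1^h(a_1^*)) \cdot \alpha_1^n(a_1) \cdots \alpha_{k-1}^n(a_{k-1}) \big) \\
&\quad \sim \frac{1}{H}\sum_{h=1}^{H} \frac{1}{N}\sum_{n=1}^{N} \tau\big( \alpha_1^n(\alpha_1^h(a_1^*)a_1) \cdots \alpha_{k-1}^n(\alpha_{k-1}^h(a_{k-1}^*)a_{k-1}) \big) \\
&\quad = \frac{1}{H}\sum_{h=1}^{H} \frac{1}{N}\sum_{n=1}^{N} \tau\big( \alpha_1^h(a_1^*)a_1 \cdot (\alpha_2\alpha_1^{-1})^n(\alpha_2^h(a_2^*)a_2) \cdots (\alpha_{k-1}\alpha_1^{-1})^n(\alpha_{k-1}^h(a_{k-1}^*)a_{k-1}) \big) \\
&\quad = \frac{1}{H}\sum_{h=1}^{H} \tau\Big( \alpha_1^h(a_1^*)a_1 \cdot \frac{1}{N}\sum_{n=1}^{N} (\alpha_2\alpha_1^{-1})^n(\alpha_2^h(a_2^*)a_2) \cdots (\alpha_{k-1}\alpha_1^{-1})^n(\alpha_{k-1}^h(a_{k-1}^*)a_{k-1}) \Big)
\end{align*}

as $N \to \infty$ and then $H \to \infty$. However, now we notice that the inner average of operators with respect to $N$ here is precisely of the form hypothesized by the theorem, but involving only the $k-1$ automorphisms $\alpha_j\alpha_1^{-1}$, $j = 1, 2, \ldots, k-1$, which still satisfy the necessary hypotheses of relative weak mixing and asymptotic abelianness. Hence this operator average asymptotically agrees with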

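\begin{align*}
&\frac{1}{H}\sum_{h=1}^{H} \tau\Big( \alpha_1^h(a_1^*)a_1 \cdot \Big( \frac{1}{N}\sum_{n=1}^{N} (\alpha_2\alpha_1^{-1})^n(E_{\mathcal{N}}(\alpha_2^h(a_2^*)a_2)) \cdots (\alpha_{k-1}\alpha_1^{-1})^n(E_{\mathcal{N}}(\alpha_{k-1}^h(a_{k-1}^*)a_{k-1})) \Big) \Big) \\
&\quad = \frac{1}{H}\sum_{h=1}^{H} \tau\Big( E_{\mathcal{N}}(\alpha_1^h(a_1^*)a_1) \cdot \Big( \frac{1}{N}\sum_{n=1}^{N} (\alpha_2\alpha_1^{-1})^n(E_{\mathcal{N}}(\alpha_2^h(a_2^*)a_2)) \cdots (\alpha_{k-1}\alpha_1^{-1})^n(E_{\mathcal{N}}(\alpha_{k-1}^h(a_{k-1}^*)a_{k-1})) \Big) \Big)
\end{align*}

where the second equality holds because the operator average in the inner brackets now lies in $\mathcal{N}$, and so we apply the usual identity for conditional expectations

\[ \tau(aE_{\mathcal{N}}(b)) = \tau(E_{\mathcal{N}}(aE_{\mathcal{N}}(b))) = \tau(E_{\mathcal{N}}(a)E_{\mathcal{N}}(b)). \]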

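Writing

\[ s_N := \frac{1}{N}\sum_{n=1}^{N} (\alpha_2\alpha_1^{-1})^n(E_{\mathcal{N}}(\alpha_2^h(a_2^*)a_2)) \cdots (\alpha_{k-1}\alpha_1^{-1})^n(E_{\mathcal{N}}(\alpha_{k-1}^h(a_{k-1}^*)a_{k-1})), \]

we see that $\|s_N\| \le C$ for some fixed $C$ and all $N \in \mathbb{N}$, and now combining this bound with the Cauchy-Schwarz inequality we obtain

\begin{align*}
\frac{1}{H}\sum_{h=1}^{H} |\tau(E_{\mathcal{N}}(\alpha_1^h(a_1^*)a_1) \cdot s_N)| &= \frac{1}{H}\sum_{h=1}^{H} |\langle s_N^*, E_{\mathcal{N}}(\alpha_1^h(a_1^*)a_1) \rangle_{L^2(\tau)}| \\
&\le \frac{1}{H}\sum_{h=1}^{H} C \cdot \|E_{\mathcal{N}}(\alpha_1^h(a_1^*)a_1)\|_{L^2(\tau)}.
\end{align*}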



Finally, it follows that this tends to $0$ as $H \to \infty$ by our assumption that $a_1 \perp \mathcal{N}$ and the relative weak mixing hypothesis. This completes the proof of Theorem 1.8.



Appendix B. A group theory construction 



The purpose of this appendix is to explicitly describe a certain type of group, 
which we shall term a square group, generated by relations involving quadruples of 
generators. In particular, we will be able to solve the equality problem for such 
groups. Our arguments here are motivated by an observation of Grothendieck 
that groups can be identified with the sheaf of their flat connections on simplicial 
complexes, and experts will be able to detect the ideas of sheaf theory lurking 
beneath the surface of the material here, although we will not use that theory 
explicitly. 

Definition B.1 (Square groups). A square base $\square = (H \cup V, \square)$ consists of the following data:

• A set $H \cup V$ of generators, partitioned into a subset $H$ of horizontal generators and a subset $V$ of vertical generators;

• A set $\square \subset (H \times V \times H \times V) \cup (V \times H \times V \times H)$ of quadruples $(e_0, e_1, e_2, e_3)$ of alternating orientation (thus if $e_0$ is horizontal then $e_1$ must be vertical, and so forth).

Furthermore, we require the following two axioms on the set $\square$:

• (Cyclic symmetry) If $(e_0, e_1, e_2, e_3) \in \square$, then $(e_1, e_2, e_3, e_0) \in \square$.

• (Unique continuation) If $e_0, e_1 \in H \cup V$, then there is at most one quadruple $(e_0, e_1, e_2, e_3) \in \square$ with the first two components $e_0$ and $e_1$.

If $\square$ is a square base, we define the square group $G_\square$ associated to that base to be the group generated by the generators $H \cup V$, subject to the relations $e_0e_1e_2e_3 = \mathrm{id}$ for all $(e_0, e_1, e_2, e_3) \in \square$. We define the alphabet of the square base (or square group) to be the set $H \cup V \cup H^{-1} \cup V^{-1}$ consisting of the horizontal and vertical generators and their formal inverses.



To describe square groups explicitly, we shall need some notation of a combinatorial and geometric nature. Let $\mathbb{N} := \{0, 1, 2, \ldots\}$ denote the natural numbers.

Definition B.2 (Monotone paths and regions). A monotone path is a finite path in the discrete quadrant $\mathbb{N}^2$ from $(0,0)$ to some endpoint $(n,m)$ that consists only of rightward edges $(i,j) \to (i+1,j)$ and upward edges $(i,j) \to (i,j+1)$ (in particular, the path will have length $n+m$). Given a monotone path $\gamma$ from $(0,0)$ to $(n,m)$, the shadow of $\gamma$ is defined to be all the pairs $(i,j) \in \mathbb{N}^2$ such that $(i,j') \in \gamma$ for some $j' \ge j$. We say that one monotone path $\gamma'$ lies above another monotone path $\gamma$ with the same endpoint $(n,m)$ if the shadow of $\gamma'$ contains the shadow of $\gamma$. In such cases, we refer to the set-theoretic difference between the two shadows as a monotone region from $(0,0)$ to $(n,m)$, with $\gamma'$ and $\gamma$ referred to as the upper boundary and lower boundary of the region respectively.

Figure 2. A monotone region, bounded above and below by two monotone paths. Note the horizontal and vertical convexity of the monotone region.

We will also consider a monotone path as a degenerate example of a monotone region. Monotone regions are horizontally and vertically convex: if two endpoints of a horizontal or vertical line segment in $\mathbb{N}^2$ lie in a monotone region, then the interior of that segment does also.
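
The notions of shadow and "lies above" in Definition B.2 can be checked mechanically on small examples. The following is only an illustrative sketch: the encoding of a monotone path as a string of `R` (rightward) and `U` (upward) steps, and the helper names `path_vertices`, `shadow` and `lies_above`, are assumptions made here and do not appear in the text.

```python
def path_vertices(steps):
    """Vertices of the monotone path from (0, 0) encoded by 'R'/'U' steps (hypothetical encoding)."""
    i = j = 0
    vertices = [(0, 0)]
    for s in steps:
        if s == "R":
            i += 1
        elif s == "U":
            j += 1
        else:
            raise ValueError("steps must consist of 'R' and 'U' only")
        vertices.append((i, j))
    return vertices


def shadow(steps):
    """All (i, j) in N^2 such that (i, j') lies on the path for some j' >= j."""
    top = {}
    for i, j in path_vertices(steps):
        top[i] = max(top.get(i, -1), j)  # highest path vertex in column i
    return {(i, j) for i, t in top.items() for j in range(t + 1)}


def lies_above(upper, lower):
    """True if `upper` lies above `lower`: same endpoint, and its shadow contains the other shadow."""
    if path_vertices(upper)[-1] != path_vertices(lower)[-1]:
        raise ValueError("paths must share the same endpoint")
    return shadow(upper) >= shadow(lower)
```

For instance, with endpoint $(2,2)$ the path `UURR` lies above `RRUU`, but not conversely; the staircase `RURU` sits between the two.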

Definition B.3 (Flat connections). Fix a square base $\square$, and let $\Omega \subset \mathbb{N}^2$ be a set. A connection $\Gamma$ on $\Omega$ is an assignment $\Gamma((i,j) \to (i+1,j)) \in H \cup H^{-1}$ of a horizontal element of the alphabet to every horizontal edge $((i,j) \to (i+1,j))$ in $\Omega$, and an assignment $\Gamma((i,j) \to (i,j+1)) \in V \cup V^{-1}$ of a vertical element of the alphabet to every vertical edge $((i,j) \to (i,j+1))$ in $\Omega$. We adopt the convention that $\Gamma((i+1,j) \to (i,j)) := \Gamma((i,j) \to (i+1,j))^{-1}$ and $\Gamma((i,j+1) \to (i,j)) := \Gamma((i,j) \to (i,j+1))^{-1}$, where $(e^{-1})^{-1} := e$ for $e \in H \cup V$ of course.

We say that the connection $\Gamma$ is flat if for every square $(i,j), (i+1,j), (i,j+1), (i+1,j+1)$ in $\Omega$, there exists an oriented loop $f_0, f_1, f_2, f_3$ of horizontal and vertical edges around the square (in either orientation) such that $(\Gamma(f_0), \Gamma(f_1), \Gamma(f_2), \Gamma(f_3)) \in \square$.
Figure 3. A monotone region $\{A, B, C, D, E, F, G\}$ (with $A = (0,0)$, $B = (0,1)$, etc.) with a connection $\Gamma$ defined by the group elements $a, b, c, d, e, f, g, h \in G_\square$, thus for instance $\Gamma(B \to C) = b$ and $\Gamma(C \to B) = b^{-1}$. If for instance $(a, b, g^{-1}, h^{-1})$ and $(f, e, d^{-1}, c^{-1})$ are in $\square$, then this connection is flat.



A flat connection on a monotone region from $(0,0)$ to $(n,m)$ is said to be maximal if it cannot be extended to any strictly larger monotone region with the same endpoints. It is reduced if there does not exist a triple $(i,j), (i+1,j), (i+2,j)$ or $(i,j), (i,j+1), (i,j+2)$ in $\Omega$ such that $\Gamma((i,j) \to (i+1,j))\Gamma((i+1,j) \to (i+2,j)) = \mathrm{id}$ or $\Gamma((i,j) \to (i,j+1))\Gamma((i,j+1) \to (i,j+2)) = \mathrm{id}$.



In the degenerate case when $\Omega$ is just a monotone path, every connection is automatically flat, as there are no squares.

Let $\Gamma$ be a flat connection on a monotone region $\Omega$. Then one can integrate this connection to produce a map $\Phi_\Gamma : \Omega \to G_\square$ by setting $\Phi_\Gamma(0,0) := \mathrm{id}$ and $\Phi_\Gamma(v) = \Phi_\Gamma(u)\Gamma(u \to v)$ for all horizontal and vertical edges $(u \to v)$ in $\Omega$. From the flatness of $\Gamma$ and the "connected" nature of $\Omega$ it is easy to see that $\Phi_\Gamma$ exists and is unique. In particular, we can define the definite integral $|\Gamma|$ of $\Gamma$ to be the group element $|\Gamma| := \Phi_\Gamma(n,m)$, where $(n,m)$ is the endpoint of $\Omega$.

Example B.1. The definite integral of the flat connection in Figure 3 is equal to
$abcd = abfe = hgcd = hgfe$. ◁
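
As a small illustration of Example B.1, one can enumerate the monotone paths inside the region of Figure 3 and read off the word obtained by integrating along each of them. This is a sketch under assumptions: the coordinates $C = (1,1)$, $D = (1,2)$, $E = (2,2)$, $F = (2,1)$, $G = (1,0)$ are inferred from the examples in this appendix rather than stated in the caption, and the names `EDGES` and `path_words` are hypothetical. The code only lists the words; that they represent the same group element is a consequence of flatness, which the code does not verify.

```python
# Edge labels of the connection in Figure 3, keyed by (tail, head).
# Vertex coordinates other than A = (0, 0) and B = (0, 1) are assumptions.
EDGES = {
    ((0, 0), (0, 1)): "a",  # A -> B
    ((0, 1), (1, 1)): "b",  # B -> C
    ((1, 1), (1, 2)): "c",  # C -> D
    ((1, 2), (2, 2)): "d",  # D -> E
    ((1, 1), (2, 1)): "f",  # C -> F
    ((2, 1), (2, 2)): "e",  # F -> E
    ((0, 0), (1, 0)): "h",  # A -> G
    ((1, 0), (1, 1)): "g",  # G -> C
}


def path_words(start, end):
    """Words read along all monotone paths from start to end that stay inside the region."""
    if start == end:
        return [""]
    i, j = start
    words = []
    for nxt in ((i + 1, j), (i, j + 1)):  # rightward step, then upward step
        label = EDGES.get((start, nxt))
        if label is not None:
            words += [label + w for w in path_words(nxt, end)]
    return words
```

Calling `sorted(path_words((0, 0), (2, 2)))` lists the four words appearing in Example B.1.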



Observe that every group element $g$ in $G_\square$ can arise as a definite integral of some flat connection, simply by expressing $g$ as a word in the alphabet $H \cup V \cup H^{-1} \cup V^{-1}$, and creating an associated monotone path and connection for that word. Later on we shall see that the definite integral will provide a one-to-one correspondence between group elements and maximal reduced flat connections (Corollary B.7).

We have the following fundamental facts: 

Lemma B.4. Let $\square$ be a square base, and let $(n,m) \in \mathbb{N}^2$.

• (Unique continuation) If $\Omega$ is a monotone region from $(0,0)$ to $(n,m)$, and $\gamma$ is a path from $(0,0)$ to $(n,m)$ in $\Omega$, then any flat connection on $\Omega$ is uniquely determined by its restriction to $\gamma$. In other words, if $\Gamma, \Gamma'$ are two flat connections on $\Omega$ that agree on $\gamma$, then they agree on all of $\Omega$.

• (Maximality) If $\Omega_0$ is a monotone region from $(0,0)$ to $(n,m)$, and $\Gamma$ is a flat connection on $\Omega_0$, then there exists a unique extension of $\Gamma$ to a maximal flat connection on a monotone region $\Omega$ from $(0,0)$ to $(n,m)$ containing $\Omega_0$.

Proof. We first establish unique continuation. This is best explained visually. The key observation is that if two flat connections on a square agree on two adjacent sides of a square, then they must agree on the whole square. This is ultimately a consequence of the unique continuation property of the square base $\square$, and can be verified by a routine case check. Thus, if $\Gamma, \Gamma'$ are two connections on $\Omega$ that agree on $\gamma$, they also agree on any perturbation of $\gamma$ in $\Omega$ formed by taking an adjacent pair of horizontal and vertical edges in $\gamma$ and "popping" them by replacing them by the other two edges of the square that they form; note that this retains the property of being a monotone path. One can check that after a sufficient number of upward and downward "popping" operations one can cover the upper and lower boundaries of $\Omega$, and everything in between, and the claim follows.

Example B.2. We continue working with Figure 3. Suppose two flat connections $\Gamma, \Gamma'$ on the indicated region agree on the upper boundary $ABCDE$, with the indicated connection values $a, b, c, d$. By unique continuation of $\square$, the only possible values available for $\Gamma, \Gamma'$ on the remaining two edges $CF, FE$ of the square $CDEF$ are $f$ and $e$. Thus we may "pop" the upper square and obtain that $\Gamma, \Gamma'$ also agree on the monotone path $ABCFE$. After popping the lower square also we obtain that $\Gamma, \Gamma'$ agree on the entire monotone region.

To prove the second claim, we simply observe that if $\Gamma$ can be extended to two monotone regions $\Omega, \Omega'$ containing $\Omega_0$, then by unique continuation they agree on the intersection $\Omega \cap \Omega'$ (which is also a monotone region), and can thus be glued to form a flat connection on the union $\Omega \cup \Omega'$ (which is also a monotone region¹). Since there are only finitely many monotone regions from $(0,0)$ to $(n,m)$, the claim then follows from the greedy algorithm. □

Now we need a fundamental definition. 

Definition B.5 (Concatenation). Let $\Gamma$ be a maximal reduced flat connection on some monotone region $\Omega$ from $(0,0)$ to $(n,m)$, and let $x \in H \cup V \cup H^{-1} \cup V^{-1}$ be a symbol in the alphabet. We define the concatenation $\Gamma \cdot x$ of $\Gamma$ with $x$ to be the maximal flat connection $\Gamma' = \Gamma \cdot x$ on a monotone region $\Omega'$ from $(0,0)$ to $(n',m')$ generated by the following rule.

• (Collapse) If $x$ is horizontal (i.e. $x \in H \cup H^{-1}$), $(n-1,m)$ lies in $\Omega$, and $\Gamma((n-1,m) \to (n,m)) = x^{-1}$, then one sets $(n',m') := (n-1,m)$, sets $\Omega'$ to be the restriction of $\Omega$ to the region $\{(i,j) \in \mathbb{N}^2 : i \le n-1\}$ (i.e. one deletes the rightmost column of $\Omega$), and sets $\Gamma'$ to be the restriction of $\Gamma$ to $\Omega'$.



One way to see this is to rotate the plane by 45 degrees, so that monotone paths become 
graphs of discrete Lipschitz functions with Lipschitz constant 1, and monotone regions become 
the regions between two such functions. 






• (Extension) If $x$ is horizontal, and either $(n-1,m)$ lies outside of $\Omega$ or $\Gamma((n-1,m) \to (n,m)) \ne x^{-1}$, then one sets $(n',m') := (n+1,m)$, and extends $\Gamma$ to $\Omega \cup \{(n+1,m)\}$ by setting $\Gamma((n,m) \to (n+1,m)) := x$; note that this is still flat because it does not create any squares. One then extends $\Gamma$ further by the second part of Lemma B.4 to create the maximal flat connection $\Gamma'$ on $\Omega'$ that extends $\Gamma$.

• If $x$ is vertical instead of horizontal, one follows the analogue of the above rules but with the roles of $n$ and $m$ reversed.

Example B.3. Imagine one concatenated a horizontal edge $x$ to the flat connection in Figure 3, which we shall assume to be maximal reduced. If $x$ is not equal to $d^{-1}$, then the concatenated connection would thus extend one unit to the right of $E$ to the endpoint $(3,2)$, and may possibly extend also to the square to the right of $EF$ if there is an appropriate tuple in $\square$ to achieve this extension. If instead $x$ was equal to $d^{-1}$, then the connection would collapse to the region $\{A, B, C, D, G\}$, so that the endpoint is now $D = (1,2)$. ◁
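
To get a feel for the collapse and extension rules, it may help to look at the degenerate situation in which the set $\square$ of quadruples is empty. Under that assumption (which is ours, not the paper's) no square can ever be filled in, so the maximal extension step of Definition B.5 adds nothing, a maximal reduced flat connection is just a reduced word on a monotone path, and concatenation becomes free reduction. The sketch below models only this special case; the encoding of a letter as a `(name, sign)` pair and the helper names are hypothetical.

```python
def inverse(x):
    """Formal inverse of a letter (name, sign) with sign = +1 or -1."""
    name, sign = x
    return (name, -sign)


def concat(word, x):
    """Concatenation of a reduced word with a letter x, assuming the square base has no quadruples."""
    if word and word[-1] == inverse(x):
        return word[:-1]      # collapse: the last edge is x^{-1}, so delete it
    return word + (x,)        # extension: append a new edge labelled x


def is_reduced(word):
    """True if no two consecutive letters are mutually inverse."""
    return all(word[t + 1] != inverse(word[t]) for t in range(len(word) - 1))
```

In this toy model one can check directly that concatenation preserves reducedness and that concatenating with $x$ and then with $x^{-1}$ returns the original word, which are the first two properties one would hope for in general.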

The importance of this definition lies in the fact that it gives a representation of the square group $G_\square$:

Lemma B.6. Let $\square$ be a square base, and let $\Gamma$ be a maximal reduced flat connection.

• (Preservation of reducibility) For any $x \in H \cup V \cup H^{-1} \cup V^{-1}$, $\Gamma \cdot x$ is reduced.

• (Invertibility) For any $x \in H \cup V \cup H^{-1} \cup V^{-1}$, one has $(\Gamma \cdot x) \cdot x^{-1} = \Gamma$.

• (Square relations) For any $(e_0, e_1, e_2, e_3) \in \square$, one has $(((\Gamma \cdot e_0) \cdot e_1) \cdot e_2) \cdot e_3 = \Gamma$.

In particular, the group $G_\square$ acts on the space $\mathcal{O}$ of maximal reduced flat connections in a unique manner, sending $\Gamma$ to $\Gamma \cdot g$ for any $\Gamma \in \mathcal{O}$ and $g \in G_\square$.

Proof. We begin with the preservation of reducibility claim. If $\Gamma \cdot x$ is formed by collapsing $\Gamma$, the claim is clear, so suppose instead that $\Gamma \cdot x$ is formed by extension. By symmetry we may assume that $x$ is horizontal. Let $(n,m)$ denote the endpoint of $\Gamma$, and let $\Omega'$ be the domain of $\Gamma \cdot x$ (which then has endpoint $(n+1,m)$).

Assume for contradiction that $\Gamma \cdot x$ is not reduced. Since $\Gamma$ was reduced, there are only two possibilities: either one has a vertical degeneracy

\[ \Gamma((n+1,j) \to (n+1,j+1))\,\Gamma((n+1,j+1) \to (n+1,j+2)) = \mathrm{id} \tag{29} \]

for some $(n+1,j), (n+1,j+1), (n+1,j+2) \in \Omega'$, or else one has a horizontal degeneracy

\[ \Gamma((n-1,j) \to (n,j))\,\Gamma((n,j) \to (n+1,j)) = \mathrm{id} \tag{30} \]

for some $(n-1,j), (n,j), (n+1,j) \in \Omega'$.



Suppose first that one has a vertical degeneracy (29). Consider the restrictions $\Gamma_1, \Gamma_2$ of the connection $\Gamma$ on the adjacent squares $((n,j), (n,j+1), (n+1,j), (n+1,j+1))$ and $((n,j+1), (n,j+2), (n+1,j+1), (n+1,j+2))$. By construction, $\Gamma_1, \Gamma_2$ agree on their common edge $((n,j+1) \to (n+1,j+1))$, and $\Gamma_1((n+1,j+1) \to (n+1,j))$ is equal to $\Gamma_2((n+1,j+1) \to (n+1,j+2))$. By the unique continuation property of $\square$, this implies that $\Gamma_1$ and $\Gamma_2$ are reflections of each other, and in particular that $\Gamma_1((n,j+1) \to (n,j))$ is equal to $\Gamma_2((n,j+1) \to (n,j+2))$. But this implies that $\Gamma$ is not reduced, a contradiction.

Now suppose instead that one has a horizontal degeneracy (30). From Definition B.5 we know that $j$ cannot equal $m$, otherwise we would have collapsed rather than extended $\Gamma$. Let $0 \le j < m$ be the largest $j$ for which (30) holds. By repeating the argument in the previous paragraph, we see that the restrictions of $\Gamma$ to the adjacent squares $\{(n-1,j), (n,j), (n-1,j+1), (n,j+1)\}$ and $\{(n,j), (n+1,j), (n,j+1), (n+1,j+1)\}$ are reflections of each other, which implies that (30) also holds for $j+1$, contradicting the maximality of $j$. This establishes the preservation of reducibility.

Now we establish the invertibility. Again, by symmetry we may assume that $x$ is horizontal.

If $\Gamma \cdot x$ is a (horizontal) extension of $\Gamma$, then it is easy to see from Definition B.5 that $(\Gamma \cdot x) \cdot x^{-1}$ will be the (horizontal) collapse of $\Gamma \cdot x$, which is $\Gamma$. Conversely, if $\Gamma \cdot x$ is the (horizontal) collapse of $\Gamma$, then $(\Gamma \cdot x) \cdot x^{-1}$ will be the (horizontal) extension (because $\Gamma$ was reduced), which will equal $\Gamma$ again (by uniqueness of maximal extension).

Finally, we establish the square relations. From cyclic symmetry and invertibility we may assume that $e_0, e_2$ are horizontal and $e_1, e_3$ are vertical. From invertibility again, it suffices to show that

\[ (\Gamma \cdot e_0) \cdot e_1 = (\Gamma \cdot e_3^{-1}) \cdot e_2^{-1} \]

for any maximal reduced flat connection $\Gamma$. We use $(n,m)$ to denote the endpoint of $\Gamma$.

We divide into four cases. Suppose first that $\Gamma \cdot e_0$ is an extension of $\Gamma$, and that $(\Gamma \cdot e_0) \cdot e_1$ is an extension of $\Gamma \cdot e_0$. Then we claim that $\Gamma \cdot e_3^{-1}$ is an extension of $\Gamma$. For if this were not the case, then $\Gamma((n,m-1) \to (n,m))$ must equal $e_3$, but then as $(\Gamma \cdot e_0)((n,m) \to (n+1,m))$ equals $e_0$ by construction, the domain of $\Gamma \cdot e_0$ must include the square $(n,m-1), (n,m), (n+1,m-1), (n+1,m)$ with $(\Gamma \cdot e_0)((n+1,m-1) \to (n+1,m)) = e_1^{-1}$, causing $(\Gamma \cdot e_0) \cdot e_1$ to be a collapse rather than an extension, a contradiction. Thus $\Gamma \cdot e_3^{-1}$ extends $\Gamma$. A similar argument shows that $(\Gamma \cdot e_3^{-1}) \cdot e_2^{-1}$ extends $\Gamma \cdot e_3^{-1}$ (otherwise $\Gamma((n-1,m) \to (n,m))$ would equal $e_0^{-1}$, causing $\Gamma \cdot e_0$ to be a collapse rather than an extension). It is then easy to verify that $(\Gamma \cdot e_3^{-1}) \cdot e_2^{-1}$ and $(\Gamma \cdot e_0) \cdot e_1$ are the same (since they glue together to form a flat connection on $\Omega$ and on the square $(n,m), (n+1,m), (n,m+1), (n+1,m+1)$).

Now suppose that $\Gamma \cdot e_0$ is an extension of $\Gamma$, but that $(\Gamma \cdot e_0) \cdot e_1$ is a collapse of $\Gamma \cdot e_0$. Arguing as before, we conclude that $\Gamma((n,m-1) \to (n,m))$ equals $e_3$, and so $\Gamma \cdot e_3^{-1}$ is a collapse of $\Gamma$; similarly, $(\Gamma \cdot e_3^{-1}) \cdot e_2^{-1}$ cannot be a collapse of $\Gamma \cdot e_3^{-1}$ (this would force $\Gamma \cdot e_0$ to be a collapse also) and so is an extension. It is again easy to verify that $(\Gamma \cdot e_3^{-1}) \cdot e_2^{-1}$ and $(\Gamma \cdot e_0) \cdot e_1$ are the same.

The remaining two cases (when $\Gamma \cdot e_0$ is a collapse of $\Gamma$, and $(\Gamma \cdot e_0) \cdot e_1$ is either an extension or collapse of $\Gamma \cdot e_0$) are similar to the preceding two, and are left to the reader. □
This gives us a satisfactory explicit description of a square group:

Corollary B.7. Let $\square$ be a square base. Then the definite integral map $\Gamma \mapsto |\Gamma|$ is a bijection from $\mathcal{O}$ to $G_\square$; thus every group element has a unique representation as the definite integral of a maximal reduced flat connection.

Proof. The surjectivity of this map was already established in the discussion after Definition B.3, so it suffices to establish the injectivity. We will establish this via the identity

\[ \Gamma = \emptyset \cdot |\Gamma| \]

for all $\Gamma \in \mathcal{O}$, where $\emptyset$ is the trivial flat connection over the monotone region $\{(0,0)\}$ from $(0,0)$ to $(0,0)$. This identity shows that $\Gamma$ can be reconstructed from $|\Gamma|$, demonstrating injectivity.

Let $\Omega$ be the domain of $\Gamma$, which by definition is a monotone region from $(0,0)$ to some point $(n,m)$. Let $\gamma$ be some monotone path in $\Omega$ from $(0,0)$ to $(n,m)$ (e.g. one could take $\gamma$ to be the upper or lower boundary of $\Omega$). We label the vertices of $\gamma$ in order as $(0,0) = (i_0,j_0), (i_1,j_1), \ldots, (i_{n+m},j_{n+m}) = (n,m)$. From the definition of $|\Gamma|$, we see that

\[ |\Gamma| = \Gamma((i_0,j_0) \to (i_1,j_1))\,\Gamma((i_1,j_1) \to (i_2,j_2)) \cdots \Gamma((i_{n+m-1},j_{n+m-1}) \to (i_{n+m},j_{n+m})). \]

For each $0 \le k \le n+m$, let $\Omega_k$ be the portion of $\Omega$ which is in the region $\{(i,j) : i \le i_k, j \le j_k\}$, thus $\Omega_k$ is a monotone region from $(0,0)$ to $(i_k,j_k)$ which is increasing in $k$. Let $\Gamma_k$ be the restriction of $\Gamma$ to $\Omega_k$. As $\Gamma$ was maximal and reduced, each of the $\Gamma_k$ is also. Since $\Gamma_{n+m} = \Gamma$, it will suffice to establish that

\[ \Gamma_k = \emptyset \cdot \Gamma((i_0,j_0) \to (i_1,j_1))\,\Gamma((i_1,j_1) \to (i_2,j_2)) \cdots \Gamma((i_{k-1},j_{k-1}) \to (i_k,j_k)) \]

for all $0 \le k \le n+m$. But this is easily established by induction (the reduced nature of the $\Gamma_k$ is necessary to avoid the collapse case in Definition B.5). □



As a consequence of this corollary, we can distinguish any two elements in $G_\square$ from each other as long as we can express them as the definite integrals of distinct maximal reduced flat connections.

B.1. Applications. We now specialise the above abstract group-theoretic machinery to the application at hand. We begin with a proposition which will be used to show non-convergence of quadruple recurrence (Theorem 2.1).

Proposition B.8 (Independence of AP4 relations). Let $A \subset \mathbb{Z}$ be a (possibly infinite) set of integers. Then there exist a group $G$ with elements $e_0, e_1, e_2, e_3$, together with an automorphism $T : G \to G$, such that for $r \in \mathbb{N}$, the relation

\[ e_0(T^r e_1)(T^{2r} e_2)(T^{3r} e_3) = \mathrm{id} \tag{31} \]

holds if and only if $r \in A$. Furthermore, no power $T^k$ of $T$ with $k \ne 0$ has any fixed points other than the identity element $\mathrm{id}$.



Remark B.4. Informally, this proposition asserts that the algebraic relations (31) for various $r \in \mathbb{Z}$ are independent of each other. In contrast, with progressions of length three (i.e. in the case $k = 3$) the analogous relations are highly degenerate. Indeed, suppose that

\[ e_0(T^r e_1)(T^{2r} e_2) = \mathrm{id} \tag{32} \]

for all $r \in A$. Then if $r, r+h$ lie in $A$, we have

\[ e_0(T^r e_1)(T^{2r} e_2) = e_0(T^r T^h e_1)(T^{2r} T^{2h} e_2) \]

which we can rearrange as

\[ (T^h e_1)^{-1} e_1 = T^r((T^{2h} e_2)e_2^{-1}). \]

If $r, r+h, r', r'+h$ lie in $A$, we thus have

\[ T^r((T^{2h} e_2)e_2^{-1}) = T^{r'}((T^{2h} e_2)e_2^{-1}). \]

Assuming that $T^{r-r'}$ has no fixed points, we conclude that $(T^{2h} e_2)e_2^{-1}$ is the identity; assuming that $T^{2h}$ has no fixed points either, we conclude that $e_2$ is the identity. Similar arguments can be used to show that $e_0$ and then $e_1$ are also the identity. Thus the relations (32) and the no-fixed-points hypothesis lead to a total collapse of the group generated by $e_0, e_1, e_2$ as soon as $A$ contains even a single non-trivial parallelogram $r, r+h, r', r'+h$. (A variant of this argument also shows that if (32) is obeyed for $r$ and $r+h$, then it is also obeyed for $r+2h$ even without the fixed point hypothesis.) This algebraic distinction between triple recurrence and quadruple recurrence can be viewed as the primary reason why recurrence and convergence results continue to hold for triple products, but not for quadruple products even under the assumption of ergodicity (which is reflected here in the no-fixed-points assumption). ◁



Proof. We let $G$ be the group generated by the generators $e_{i,n}$ for $i = 0, 1, 2, 3$ and $n \in \mathbb{Z}$, subject to the relations

\[ e_{0,n}\,e_{1,n+r}\,e_{2,n+2r}\,e_{3,n+3r} = \mathrm{id} \]

for all $n \in \mathbb{Z}$ and $r \in A$. As the set of such relations is invariant under the shift $e_{i,n} \mapsto e_{i,n+1}$, we see that we can define an automorphism $T : G \to G$ by setting $Te_{i,n} := e_{i,n+1}$. If we then set $e_i := e_{i,0}$, it is clear that (31) holds for all $r \in A$.

To see that (31) fails for $r \notin A$, we observe that $G$ can be viewed as a square group, with the horizontal generators $\{e_{i,n} : i = 0, 2;\ n \in \mathbb{Z}\}$ and vertical generators $\{e_{i,n} : i = 1, 3;\ n \in \mathbb{Z}\}$ and square relations $\square$ consisting of $(e_{0,n}, e_{1,n+r}, e_{2,n+2r}, e_{3,n+3r})$ and its cyclic permutations for all $n \in \mathbb{Z}$ and $r \in A$; note that the crucial unique continuation property follows from the basic observation that an arithmetic progression is determined by any two of its elements ("two points determine a line"). If $n \in \mathbb{Z}$ and $r \notin A$, one sees that the connection on the path of length four from $(0,0)$ to $(2,2)$ associated to the word $e_{0,n}e_{1,n+r}e_{2,n+2r}e_{3,n+3r}$ is already a maximal reduced flat connection (as none of the three squares that share two edges with the path can be completed to a square from $\square$) and so by Corollary B.7 its definite integral $e_{0,n}e_{1,n+r}e_{2,n+2r}e_{3,n+3r}$ is not equal to the identity, as required.

Finally, to show that $T^k$ has no non-trivial fixed points, one simply observes that $T^k$ will shift any non-trivial maximal reduced flat connection to a different maximal reduced flat connection, and then invokes Corollary B.7 again. □
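
The "two points determine a line" observation can be sanity-checked on a finite window of the square base used in this proof. The sketch below is illustrative only: it encodes the generator $e_{i,n}$ as the pair `(i, n)`, truncates $n$ to a finite range, and uses hypothetical helper names; it is a finite spot check, not a proof.

```python
from itertools import product


def ap4_square_base(A, n_range):
    """Quadruples (e_{0,n}, e_{1,n+r}, e_{2,n+2r}, e_{3,n+3r}), r in A, with all cyclic shifts."""
    base = set()
    for n, r in product(n_range, A):
        quad = tuple((i, n + i * r) for i in range(4))
        for s in range(4):
            base.add(quad[s:] + quad[:s])
    return base


def is_horizontal(gen):
    """Generators e_{0,n} and e_{2,n} are horizontal; e_{1,n} and e_{3,n} are vertical."""
    return gen[0] % 2 == 0


def has_alternating_orientation(base):
    return all(is_horizontal(q[t]) != is_horizontal(q[(t + 1) % 4])
               for q in base for t in range(4))


def has_cyclic_symmetry(base):
    return all(q[1:] + q[:1] in base for q in base)


def has_unique_continuation(base):
    """At most one quadruple for each choice of first two components."""
    seen = {}
    for q in base:
        if seen.setdefault(q[:2], q) != q:
            return False
    return True
```

Adding a quadruple that is not an arithmetic progression in the second index, but shares its first two entries with a genuine one, breaks unique continuation, which is the behaviour one expects.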



Next, we establish a variant that is useful for showing negative averages for quintuple recurrence (Theorem 2.7).

Proposition B.9 (Independence of AP5 relations). There exists a group $G$ with distinct elements $e_0, e_1, e_2, e_3, e_4$, together with an automorphism $T : G \to G$, such that the relation

\[ e_0(T^r e_1)(T^{2r} e_2)(T^{3r} e_3)(T^{4r} e_4) = \mathrm{id} \tag{33} \]

holds for all $r \in \mathbb{Z}$. Furthermore, no power $T^k$ of $T$ with $k \ne 0$ has any fixed points other than the identity element $\mathrm{id}$. Finally, if $r \in \mathbb{Z}$ is nonzero, and

\[ g_0, g_1, g_2, g_3, g_4 \in \{\mathrm{id}, e_0, e_1, e_2, e_3, e_4, e_0^{-1}, e_1^{-1}, e_2^{-1}, e_3^{-1}, e_4^{-1}\} \]

are such that

\[ g_0(T^r g_1)(T^{2r} g_2)(T^{3r} g_3)(T^{4r} g_4) = \mathrm{id}, \tag{34} \]

then $g_0, g_1, g_2, g_3, g_4$ are either equal to the identity, or are a permutation of $\{e_0, e_1, e_2, e_3, e_4\}$ or of $\{e_0^{-1}, e_1^{-1}, e_2^{-1}, e_3^{-1}, e_4^{-1}\}$.



Proof. For each $i = 0, 1, 2, 3, 4$, we define $G^{(i)}$ to be the group generated by the generators $e^{(i)}_{j,n}$ for $j \in \{0, 1, 2, 3, 4\} \setminus \{i\}$ and $n \in \mathbb{Z}$ subject to the relations

\[ e^{(i)}_{0,n}\,e^{(i)}_{1,n+r}\,e^{(i)}_{2,n+2r}\,e^{(i)}_{3,n+3r}\,e^{(i)}_{4,n+4r} = \mathrm{id} \tag{35} \]

for all $n, r \in \mathbb{Z}$, with the convention that $e^{(i)}_{i,n} = \mathrm{id}$ for all $n$. This group has an automorphism $T^{(i)} : G^{(i)} \to G^{(i)}$ that maps $e^{(i)}_{j,n}$ to $e^{(i)}_{j,n+1}$ for all $n$.

We now set $G$ to be the product group $G := G^{(0)} \times G^{(1)} \times \ldots \times G^{(4)}$, and set

\[ e_j := (e^{(0)}_{j,0}, e^{(1)}_{j,0}, \ldots, e^{(4)}_{j,0}) \]

for $j = 0, 1, 2, 3, 4$. We also set

\[ T(g^{(0)}, \ldots, g^{(4)}) := (T^{(0)}g^{(0)}, T^{(1)}g^{(1)}, \ldots, T^{(4)}g^{(4)}), \]

thus $T$ is an automorphism on $G$. By construction it is clear that (33) holds. Also, by the arguments in Proposition B.8, no non-zero power of $T^{(i)}$ has any non-trivial fixed points, and so the same is also true of $T$.

Now we establish the final claim of the proposition. Suppose $g_0, \ldots, g_4$ obey the stated properties. Let $i = 0, 1, 2, 3, 4$, and let $g_j^{(i)}$ be the $G^{(i)}$ component of $g_j$ for $j = 0, 1, 2, 3, 4$, thus

\[ g_0^{(i)}\big((T^{(i)})^{r} g_1^{(i)}\big)\big((T^{(i)})^{2r} g_2^{(i)}\big)\big((T^{(i)})^{3r} g_3^{(i)}\big)\big((T^{(i)})^{4r} g_4^{(i)}\big) = \mathrm{id}. \tag{36} \]

From construction of $G^{(i)}$, we see that for any distinct $j, k \in \{0, 1, 2, 3, 4\} \setminus \{i\}$, there is a homomorphism $\phi^{(i)}_{j,k} : G^{(i)} \to \mathbb{Z}$ to the additive group $\mathbb{Z}$ that maps $e^{(i)}_{j,n}$ to $+1$, $e^{(i)}_{k,n}$ to $-1$, and all other $e^{(i)}_{\ell,n}$ to zero for $n \in \mathbb{Z}$ and $\ell \in \{0, 1, 2, 3, 4\} \setminus \{i, j, k\}$ (note that these requirements are compatible with the defining relations (35)). This homomorphism is $T^{(i)}$-invariant. Applying this homomorphism to (36), we obtain

\[ \sum_{\ell=0}^{4} \phi^{(i)}_{j,k}(g_\ell^{(i)}) = 0. \]

In other words, the number of times $g_\ell$ for $\ell = 0, 1, 2, 3, 4$ equals $e_j$, minus the number of times it equals $e_j^{-1}$, is equal to the number of times $g_\ell$ equals $e_k$, minus the number of times it equals $e_k^{-1}$. Letting $j, k, i$ vary, we thus see that this number is independent of $j$. It is easy to see that this number cannot exceed $1$ in magnitude, and if it is equal to $+1$ or $-1$, then $g_0, g_1, g_2, g_3, g_4$ is a permutation of $\{e_0, e_1, e_2, e_3, e_4\}$ or of $\{e_0^{-1}, e_1^{-1}, e_2^{-1}, e_3^{-1}, e_4^{-1}\}$ respectively. (Note that this argument also ensures that $e_0, e_1, e_2, e_3, e_4$ are distinct.) The remaining possibility to eliminate is when this number is zero, thus each $e_j$ occurs in $g_0, g_1, g_2, g_3, g_4$ as often as $e_j^{-1}$. Suppose for instance that $g_0, g_1, g_2, g_3, g_4$ contains one occurrence each of $e_0, e_0^{-1}, e_1, e_1^{-1}$. Applying (36) with $i = 4$ (say), and then applying the homomorphism that maps $e^{(4)}_{0,n}$ to zero, $e^{(4)}_{1,n}$ to $n$, $e^{(4)}_{2,n}$ to $-2n$, and $e^{(4)}_{3,n}$ to $n$ (here we use the identity $(n+r) - 2(n+2r) + (n+3r) = 0$ to ensure consistency with (35)) we obtain a contradiction. Similarly if $g_0, g_1, g_2, g_3, g_4$ contains any other combination of one or two distinct pairs $e_j, e_j^{-1}$. The remaining case to eliminate is if $g_0, g_1, g_2, g_3, g_4$ contains $e_j$ and $e_j^{-1}$ twice each for some $j$, say $j = 0$. Applying (36) with $i = 4$ again, we can use Corollary B.7 to contradict (36) (as the left-hand side is a definite integral of a maximal flat connection on a horizontal path of length four). Similarly for other values of $j$, and the claim follows. □
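
The weight homomorphism used in the case of one occurrence each of $e_0, e_0^{-1}, e_1, e_1^{-1}$ can be checked numerically. The following sketch assumes the reading of the proof given above for $i = 4$: the generator $e^{(4)}_{j,n}$ is sent to $0, n, -2n, n$ for $j = 0, 1, 2, 3$, and $e^{(4)}_{4,n}$ is trivial. The function names and the encoding of $g_\ell$ as `None` (identity) or a pair `(j, sign)` are hypothetical.

```python
from itertools import permutations


def weight(j, n):
    """Image in Z of e^{(4)}_{j,n} under the homomorphism; e_4 is trivial in G^{(4)}."""
    return {0: 0, 1: n, 2: -2 * n, 3: n, 4: 0}[j]


def relation_weight(n, r):
    """Image of the left-hand side of relation (35) with i = 4."""
    return sum(weight(j, n + j * r) for j in range(5))


def word_weight(gs, r):
    """Image of g_0 (T^r g_1) ... (T^{4r} g_4); T^{lr} shifts e_{j,0} to e_{j,lr}."""
    total = 0
    for l, g in enumerate(gs):
        if g is not None:
            j, sign = g
            total += sign * weight(j, l * r)
    return total
```

The relation weight vanishes identically, so the map is well defined, whereas any arrangement of $e_0, e_0^{-1}, e_1, e_1^{-1}$ and one identity has image $r(\ell - \ell')$ for the two distinct positions $\ell, \ell'$ of $e_1$ and $e_1^{-1}$, which is nonzero when $r \ne 0$.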